About the cover figure


The figure in the lower right-hand corner of the book cover illustrates the three 
themes from inside the book, wavelets, signals, and fractals: Firstly, this graph is one 
of the functions which are generated with the use of four magic numbers. Specifi- 
cally, these same numbers also start an algorithmic generation of Daubechies wavelet 
functions, and the reader is referred to Figures 7.5 and 7.16 on pages 120-121 and 
134 for more images, for mathematical background and explanations, more figures, 
as well as for theory and exercises in Chapter 7 itself. 
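
As a rough illustration of how four numbers can start such an algorithmic generation, here is a minimal sketch. It assumes the four numbers are the standard Daubechies four-tap coefficients, normalized so that they sum to the square root of 2; Chapter 7 may use a different normalization or a different member of the four-tap family. The helper names daubechies4 and cascade are hypothetical, and the iteration shown is the usual cascade (repeated upsampling and convolution), offered as a stand-in rather than as the book's own procedure.

```python
import numpy as np

def daubechies4():
    """Standard Daubechies four-tap low-pass coefficients (assumed normalization)."""
    s3 = np.sqrt(3.0)
    return np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

def cascade(h, levels):
    """Approximate the scaling function on a grid of spacing 2**-levels.

    Each pass upsamples the current samples by 2 and convolves with sqrt(2)*h.
    """
    phi = np.array([1.0])
    for _ in range(levels):
        up = np.zeros(2 * len(phi) - 1)
        up[::2] = phi                      # insert zeros between samples
        phi = np.sqrt(2.0) * np.convolve(up, h)
    return phi

h = daubechies4()
phi = cascade(h, levels=6)                 # samples of the approximate scaling function
```

Plotting phi against a grid of spacing 2**-6 gives a jagged, self-similar graph of the kind the cover note describes; the coefficients also satisfy the orthogonality condition h[0]*h[2] + h[1]*h[3] = 0.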

To the experts: Figure 7.5 includes the cover figure, and it represents the progres- 
sion of functions which starts with one of the four-tap cases, and which is governed 
by the so-called pyramid algorithm. The pyramid algorithm in turn is a delicate de- 
sign used in the generation of sequences of basis functions. These functions are then 
further scaled and used in the representation of wavelet packets. And the
representations and the choices lead to new bases, selected from “libraries of bases”
on the basis of entropy considerations; hence probability!

More probability: The random-walk approach to the analysis of pyramid algo- 
rithms in turn is where the calculus of probability comes into the mix. And signals: 
To begin with, the use of four magic numbers is by adaptation from an algorithmic 
design which was first used by engineers, and which is fundamental in signal pro- 
cessing; now more recently adapted to image processing as well. But the particular 
function on the cover also represents the kind of sound signals that feature a beat. 
A quick glimpse of the figure finally reveals its fractal nature: By this we mean that 
shapes in a picture are repeated at different scales up to similarity; and which further 
display an underlying algorithm. Example: Large-scale shapes which envelop similar 
shapes at smaller scales! 


Preface 


If people do not believe that mathematics is simple, it is only because they 
do not realize how complicated life is. --John von Neumann


While this is a course in analysis, our approach departs from the beaten path in some 
ways. Firstly, we emphasize a variety of connections to themes from neighboring 
fields, such as wavelets, fractals and signals; topics typically not included in a gradu- 
ate analysis course. This in turn entails excursions into domains with a probabilistic 
flavor. Yet the diverse parts of the book follow a common underlying thread, and to- 
gether they constitute a good blend; each part in the mix naturally complements the 
other. 

In fact, there are now good reasons for taking a wider view of analysis, for ex- 
ample the fact that several applied trends have come to interact in new and exciting 
ways with traditional mathematical analysis—as it was taught in graduate classes for 
generations. One consequence of these impulses from “outside” is that conventional 
boundaries between core disciplines in mathematics have become more blurred. 

Fortunately this branching out does not mean that students will need to start out 
with any different or additional prerequisites. In fact, the ideas involved in this book 
are intuitive, natural, many of them visual, and geometric. The required background 
is quite minimal and it does not go beyond what is typically required in most graduate 
programs. 

We believe that now is a good time to slightly widen the horizons of the subject 
“analysis” as we teach it by stressing its relations to neighboring fields; in fact we 
believe that analysis is thereby enriched. 

Despite the inclusion of themes from probability and even from engineering, 
the course still has an underlying core theme: A constructive approach to building 
bases in function spaces. The word “constructive” here refers to our use of recursive 
algorithms. As it turns out, the algorithmic ideas involved are commonly used in such 
diverse areas as wavelets, fractals, signal and image processing. And yet they share 
an underlying analysis core which we hope to bring to light. 


Our inclusion here of some applied topics (bordering probability theory and en- 
gineering) we believe is not only useful in itself, but more importantly, core mathe- 
matics, and analysis in particular have benefited from their many interconnections to 
trends and influences from the “outside” world. 

Yet our wider view of the topic analysis only entails a minor adjustment in course 
planning. Our branching out to some applications will be guided tours: to topics 
from probability theory (e.g., to certain random-walk models), and to signal and 
image processing. The ideas are presented from scratch, are easy to follow, and they 
do not require prior knowledge of probability or of engineering. But we will go a 
little beyond the more traditional dose of measure theory and matrix algebra that is 
otherwise standard or conventional fare in most first-year graduate courses. 

For those reasons we believe the book may also be suitable for a “second analysis 
course,” and that it leaves the instructor a variety of good options for covering a 
selection of neighboring disciplines and applications in more depth. 


Iowa City, 
June 2006 Palle E. T. Jorgensen 
Drawing by the author, next page:

Wavelet algorithms are good for vast sets of numbers. 

An engineering friend described the old approach to data mining as 

“Just drop a computer down onto a gigantic set of unstructured numbers!” 
(data mining: see Section 6.2, pp. 102-105, and the Glossary, pp. xxiv-xxv). 
Getting started


From its shady beginnings devising gambling strategies and counting corp- 
ses in medieval London, probability theory and statistical inference now 
emerge as better foundations for scientific models, especially those of the 
process of thinking and as essential ingredients of theoretical mathematics, 
even the foundations of mathematics itself. —David Mumford 


An apology 


You ask: “Why all the fuss?” — Wavelets, signals, fractals? Isn’t all of this merely a 
fad? Or a transient popularity trend? And what’s the probability part in the book all 
about? And non-commuting operators? As for bases in linear spaces, what’s wrong 
with Gram-Schmidt?

You may think: “Fourier has served us well for ages; so why do we need all the 
other basis functions?” — Wavelets and so on? — And why engineering topics in a 
mathematics course? And the pictures? Are they really necessary? 

And there are signal processing and image processing!? — Yes, technology is 
lovely, but why not leave it to the engineers? 

Response: The links between mathematics and engineering are much deeper than 
the fact that we mathematicians teach service courses for engineers. Our bread and 
butter! 

Mathematics draws ideas and strengths from the outside world, and the connec- 
tions to parts of engineering have been a boon to mathematics: From signal process- 
ing to wavelet analysis! That is true even if we forget about all of the practical appli- 
cations emerging from these connections. Without inspiration from the neighboring 
sciences, mathematics would in all likelihood become rather sterile, and overly
formal. I see opportunities at crossroads. In this book you will see the benefits
mathematics is reaping from trends and topics in engineering. It is witnessed in a striking
way by exciting developments in wavelets. From wavelets we see how notions of 
scale-similarity can be exploited in basis computations that use tricks devised for
signal processing. Just open the book and glance at some of the wavelet functions. 
At the same time, the key notion of self-similarity, such as the scale-similarity used 
everywhere for wavelets, is essential to our understanding of fractals: Fern-like pic- 
tures that look the same at small and at large scales. One problem in the generation of 
wavelet bases is selecting the “nice” (here this means differentiable) wavelets among 
huge families of fractal-looking (non-smooth, or singular) functions. $L^2$-functions
can be very “bad” indeed!! Computers generate the good and the bad, and we are 
left with the task of sorting them out and making selections. We will see (directly 
from large libraries of pictures) that mathematical wavelet machines are more likely 
to spit out bad functions unless they are told where to concentrate the search from 
the intrinsic mathematics. 

These wavelets, signals, and fractals are things that have caught our attention 
in recent decades, but the mathematical part of this has roots back at least a hun- 
dred years, for example, to Alfred Haar and to Oliver Heaviside at the turn of the 
last century. From Haar we have the first wavelet basis, and with Heaviside we see 
the beginning of signal analysis. It is unlikely that either one knew about the other. 
Ironically, at the time (1909), Haar’s paper had little impact and was hardly noticed, 
even on the small scale of “notice” that is usually applied to mathematics papers. 
Haar’s wonderful wavelet only began to draw attention in the mid-nineteen-eighties 
when the connections to modern signal processing became much better understood. 
These connections certainly served as a main catalyst in what are now known as 
wavelet tools in pure and applied mathematics. But at the outset, the pioneers in 
wavelets had to “rediscover” a lot of stuff from signal processing: frequency bands, 
high-pass, low-pass, analysis and synthesis using down-sampling, and up-sampling, 
reconstruction of signals, resolution of images; all tools that have wonderful graphics 
representations in the engineering literature. 

But still, why would we think that Fourier’s basis, and his lovely integral decom- 
position, are not good enough? Many reasons: Fourier’s method has computational 
drawbacks. This was less evident before computers became common and began to 
play important roles in applied and theoretical work. But expansion of functions or 
signals into basis decompositions (called “analysis” in signal processing) involves 
basis coefficients (Fourier coefficients, and so on), and if we are limited to Fourier 
bases, then the computation of the coefficients must by necessity rely on integration. 
“Computers can’t integrate!” Hmmm! Well, not directly. The problem must first be 
discretized. And there is need for a more direct and algorithmic approach. Hence the 
wavelet algorithm! In any case, algorithms are central in mathematics even if you 
do not concern yourself with computers. And it is the engineering connections that 
inspired the most successful algorithms in our subject. 
Glossary


function, random variable, signal, state, sequence (incl. vector-valued), random walk, 
time-series, measurement, nested subspaces, refinement, multiresolution, scales of visual 
resolutions, operator, process, black box, observable (if selfadjoint), Fourier dual pair, 
generating function, time/frequency, P/Q, convolution, filter, smearing, decomposition 

(e.g., Fourier coefficients in a Fourier expansion), analysis, frequency components, integrate 
(e.g., inverse Fourier transform), reconstruct, synthesis, superposition, subspace, resolution, 
(signals in a) frequency band, Cuntz relations, perfect reconstruction from subbands, 
subband decomposition, inner product, correlation, transition probability, probability of 
transition from one state to another, fout = T fin, input/output, transformation of states, 
fractal, conditional expectation, martingale, data mining (A translation guide!) 


“The question is,” said Alice, “whether you can make words mean so many 
different things.” --Lewis Carroll


This glossary consists of a list of terms used inside the book in varied contexts of 
mathematics, probability, engineering, and on occasion physics. To clarify the seem- 
ingly confusing use of up to four different names for the same idea or concept, we 
have further added informal explanations spelling out the reasons behind the differ- 
ences in current terminology from neighboring fields. 

When sorting through the disparate variations of lingo in the sciences, the more 
mundane problems of plain and “ordinary” languages might seem minor: There is the 
variety of differences from Indo-European to the schizophrenia of Finno-Hungarian. 
And inside the same building on campus, there are even the variations in lingo that 
separate areas of mathematics: If you are a mathematician doing analysis in “plain 
English” you might well learn to understand the “Portuguese” spoken by your col- 
leagues in algebra; but when it comes to the Hungarian of an engineer, or the Bantu 
of some physicists, you can get lost or confused. A former student just wrote me 
about the language gulf between domestic mathematics and the jungle of industry. 
He now works in a company and is doing applied wavelets. And he writes that the 
language of implementation is quite different from that of our standard mathematics 
books. 


DISCLAIMER: This glossary has the structure of four columns. A number of terms 
are listed line by line, and each line is followed by explanation. Some “terms” have 
up to four separate (yet commonly accepted) names. The last four terms in the list, 
“fractal,” “conditional expectation,” “martingale,” and “data mining,” are the only 
ones where I could think of only one name. 

It should be added that my “descriptions” for the various terms are meant to stress 
intuitive aspects, as opposed to mathematical definitions. One reason for stressing the 
intuition behind the concepts is that there is not quite agreement about the precise 
meaning of some of the terms in the four fields: in mathematics, in probability, in 
engineering, and in physics. 
function (measurable) | random variable | signal | state


Mathematically, functions may map between any two sets, say, from X to 
Y; but if X is a probability space (typically called $\Omega$), it comes with a
$\sigma$-algebra B of measurable sets, and probability measure P. Elements E in B
are called events, and P(E) the probability of E. Corresponding measurable 
functions with values in a vector space are called random variables, a ter- 
minology which suggests a stochastic viewpoint. The function values of a 
random variable may represent the outcomes of an experiment, for example 
“throwing of a die.” In this simplest experiment, the range of the random 
variable is the set of integers from 1 to 6, the six possible measurements 
resulting from the experiment. 

Yet, function theory is widely used also in engineering where functions
are typically thought of as signals. In this case, X may be the real line for
time, or $\mathbb{R}^d$. And I noticed that engineers visualize functions as signals. A
particular signal may have a stochastic component, and this feature simply 
introduces an extra stochastic variable into the “signal,” for example noise. 

Turning to physics, in our present application, the physical functions will
typically be in some $L^2$-space, and $L^2$-functions with unit norm represent
quantum mechanical “states.”


sequence (incl. vector-valued) | random walk | time-series | measurement


Mathematically, a sequence is a function defined on the integers Z or on 
subsets of Z, for example the natural numbers N. Hence, if time is discrete, 
this to the engineer represents a time series, such as a speech signal, or any 
measurement which depends on time. But we will also allow functions on 
lattices such as $\mathbb{Z}^d$.

In the case d = 2, we may be considering the grayscale numbers which 
represent exposure in a digital camera. In this case, the function (grayscale) 
is defined on a subset of $\mathbb{Z}^2$, and is then simply a matrix.

A random walk on $\mathbb{Z}^d$ is an assignment of a sequential and random motion
as a function of time. The randomness presupposes assigned probabilities. 
But we will use the term “random walk” also in connection with random 
walks on combinatorial trees. 


nested subspaces | refinement | multiresolution | scales of visual resolutions


While finite or infinite families of nested subspaces are ubiquitous in math- 
ematics, and have been popular in Hilbert-space theory for generations (at 
least since the 1930s), this idea was revived in a different guise in 1986 by
Stéphane Mallat, then an engineering graduate student; see [Mal89]. In its 
adaptation to wavelets, the idea is now referred to as the multiresolution 
method. 

What made the idea especially popular in the wavelet community was that 
it offered a skeleton on which various discrete algorithms in applied mathe- 
matics could be attached and turned into wavelet constructions in harmonic 
analysis. In fact what we now call multiresolutions have come to signify 
a crucial link between the world of discrete wavelet algorithms, which are 
popular in computational mathematics and in engineering (signal/image pro- 
cessing, data mining, etc.) on the one side, and on the other side continuous 
wavelet bases in function spaces, especially in $L^2(\mathbb{R}^d)$. Further, the
multiresolution idea closely mimics how fractals are analyzed with the use of
finite function systems. 

But in mathematics, or more precisely in operator theory, the underlying 
idea dates back to work of John von Neumann, Norbert Wiener, and Herman 
Wold, where nested and closed subspaces in Hilbert space were used exten- 
sively in an axiomatic approach to stationary processes, especially for time 
series. Wold proved that any (stationary) time series can be decomposed into 
two different parts: The first (deterministic) part can be exactly described by 
a linear combination of its own past, while the second part is the opposite 
extreme; it is unitary, in the language of von Neumann. 

von Neumann’s version of the same theorem is a pillar in operator theory. 
It states that every isometry in a Hilbert space H is the unique sum of a 
shift isometry and a unitary operator, i.e., the initial Hilbert space H splits 
canonically as an orthogonal sum of two subspaces $H_s$ and $H_u$ in H, one
which carries the shift operator, and the other, $H_u$, the unitary part. The shift
isometry is defined from a nested scale of closed spaces $V_n$, such that the
intersection of these spaces is $H_u$.
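
In symbols, the splitting can be sketched as follows. The letter S for the isometry and the particular choice of nested spaces as the ranges of the powers of S are notation assumed here for illustration; the subscripts s and u stand for the shift part and the unitary part.

```latex
\[
  H = H_s \oplus H_u, \qquad
  V_n := S^n H, \quad V_0 \supseteq V_1 \supseteq V_2 \supseteq \cdots,
\]
\[
  H_u = \bigcap_{n \ge 0} V_n, \qquad
  H_s = \bigoplus_{n \ge 0} S^n \left( H \ominus S H \right).
\]
```

On the first summand S acts as a shift, moving each piece $S^n (H \ominus SH)$ to the next one, while on the intersection it restricts to a unitary operator.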

However, Stéphane Mallat was motivated instead by the notion of scales 
of resolutions in the sense of optics. This in turn is based on a certain 
“artificial-intelligence” approach to vision and optics, developed earlier by 
David Marr at MIT, an approach which imitates the mechanism of vision in 
the human eye. 

The connection from these developments in the 1980s back to von Neumann
is this: Each of the closed subspaces $V_n$ corresponds to a level of
resolution in such a way that a larger subspace represents a finer resolution.
Resolutions are relative, not absolute! In this view, the relative complement
of the smaller (or coarser) subspace in the larger space then represents the visual
detail which is added in passing from a blurred image to a finer one, i.e., to
a finer visual resolution. 

This view became an instant hit in the wavelet community, as it offered a 
repository for the fundamental father and the mother functions, also called 
the scaling function $\varphi$, and the wavelet function $\psi$. Via a system of
translation and scaling operators, these functions then generate nested subspaces,
and we recover the scaling identities which initialize the appropriate al- 
gorithms. What results is now called the family of pyramid algorithms in 
wavelet analysis. The approach itself is called the multiresolution approach 
(MRA) to wavelets. And in the meantime various generalizations (GMRAs) 
have emerged. 
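
The scaling identities for the father function $\varphi$ and the mother function $\psi$ can be written out as a sketch. The coefficient names $h_k$ and $g_k$, the factor $\sqrt{2}$, and the indexing of the spaces $V_n$ are assumptions following one common normalization; other conventions, possibly including the one used in later chapters, differ by constants and by the direction of the index.

```latex
\[
  \varphi(x) = \sqrt{2} \sum_{k} h_k \, \varphi(2x - k), \qquad
  \psi(x)    = \sqrt{2} \sum_{k} g_k \, \varphi(2x - k),
\]
\[
  V_n = \overline{\operatorname{span}} \{ \varphi(2^n x - k) : k \in \mathbb{Z} \}, \qquad
  V_n \subset V_{n+1} .
\]
```

With this indexing a larger n means a larger subspace and hence a finer resolution, and the finitely many numbers $h_k$, $g_k$ are what a pyramid algorithm actually computes with.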

In all of this, there was a second “accident” at play: As it turned out, 
pyramid algorithms in wavelet analysis now lend themselves via multires- 
olutions, or nested scales of closed subspaces, to an analysis based on fre- 
quency bands. Here we refer to bands of frequencies as they have already 
been used for a long time in signal processing. 

Even though J. von Neumann and H. Wold had been using nested or 
scaled families of closed subspaces in representing past and future for time 
series, S. Mallat found that this same idea applies successfully to the repre- 
sentation of visual resolutions. And even more importantly, it offers a variety 
of powerful algorithms for processing of digital images. 

Now parallel to all of this, pioneers in probability theory had in fact devel- 
oped versions of the same refinement analysis. For example, in the theory 
of martingales, consistency relations may naturally be reformulated in the 
language of nested subspaces in Hilbert space. 

One reason for the success in varied disciplines of the same geometric 
idea is perhaps that it is closely modeled on how we historically have repre- 
sented numbers in the positional number system; see, e.g., [Knu81]. Analo- 
gies to the Euclidean algorithm seem especially compelling. 


operator | process | black box | observable (if selfadjoint)


In linear algebra students are familiar with the distinctions between (linear)
transformations T (here called “operators”) and matrices. For a fixed operator
$T\colon V \to W$, there is a variety of matrices, one for each choice of basis in
V and in W. In many engineering applications, the transformations are not 
restricted to be linear, but instead represent some experiment (“black box,” 
in Norbert Wiener’s terminology), one with an input and an output, usually 
functions of time. The input could be an external voltage function, the black 
box an electric circuit, and the output the resulting voltage in the circuit.
(The output is a solution to a differential equation.) 

This context is somewhat different from that of quantum mechanical 
(QM) operators $T\colon V \to V$ where V is a Hilbert space. In QM, selfadjoint
operators represent observables such as position Q and momentum P,
or time and energy. 


Fourier dual pair | generating function | time/frequency | P/Q


The following dual pairs position Q/momentum P, and time/energy may be 
computed with the use of Fourier series or Fourier transforms; and in this 
sense they are examples of Fourier dual pairs. If for example time is discrete, 
then frequency may be represented by numbers in the interval $[0, 2\pi)$; or in
$[0, 1)$ if we enter the number $2\pi$ into the Fourier exponential. Functions of
the frequency are then periodic, so the two endpoints are identified. In the
case of the interval $[0, 1)$, 0 on the left is identified with 1 on the right. So
a low frequency band is an interval centered at 0, while a high frequency 
band is an interval centered at 1/2. Let a function W on [0, 1) represent 
a probability assignment. Such functions W are thought of as “filters” in 
signal processing. We say that W is low-pass if it is 1 at 0, or if it is near 
1 for frequencies near 0. Low-pass filters pass signals with low frequencies, 
and block the others. 

If instead some filter W is 1 at 1/2, or takes values near 1 for frequencies
near 1/2, then we say that W is high-pass; it passes signals with high
frequency. 
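
The low-pass and high-pass conditions can be checked numerically on the simplest example. The sketch below uses the two Haar filters, takes W to be the squared modulus of the frequency response on [0, 1), and reads “probability assignment” as the condition W(w) + W(w + 1/2) = 1; the function names, the choice of Haar, and that reading are all assumptions made for illustration.

```python
import numpy as np

def frequency_response(h, omega):
    """m(omega) = (1/sqrt 2) * sum_k h[k] * exp(-2*pi*i*k*omega), omega in [0, 1)."""
    k = np.arange(len(h))
    return np.sum(h * np.exp(-2j * np.pi * k * omega)) / np.sqrt(2.0)

def W(h, omega):
    """Squared modulus of the frequency response: a number between 0 and 1."""
    return abs(frequency_response(h, omega)) ** 2

haar_low = np.array([1.0, 1.0]) / np.sqrt(2.0)    # averages neighbors
haar_high = np.array([1.0, -1.0]) / np.sqrt(2.0)  # differences neighbors

low_at_zero = W(haar_low, 0.0)     # low-pass: equals 1 at frequency 0
low_at_half = W(haar_low, 0.5)     # and vanishes at frequency 1/2
high_at_half = W(haar_high, 0.5)   # high-pass: equals 1 at frequency 1/2
```

For the Haar low-pass filter W works out to the square of cos(pi*w), so the two values at w and w + 1/2 always add up to 1.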


convolution | - | filter | smearing


Pointwise multiplication of functions of frequencies corresponds in the 
Fourier dual time-domain to the operation of convolution (or of Cauchy 
product if the time-scale is discrete.) The process of modifying a signal with 
a fixed convolution is called a linear filter in signal processing. The corre- 
sponding Fourier dual frequency function is then referred to as “frequency 
response” or the “frequency response function.” 

More generally, in the continuous case, since convolution tends to im- 
prove smoothness of functions, physicists call it “smearing.” 
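
For discrete time, the correspondence between the Cauchy product and pointwise multiplication of frequency functions can be verified directly. This is a minimal sketch using NumPy's FFT; the function name cauchy_product_via_fft and the sample sequences are made up for the example.

```python
import numpy as np

def cauchy_product_via_fft(a, b):
    """Multiply the zero-padded Fourier transforms pointwise, then transform back."""
    n = len(a) + len(b) - 1                 # length of the Cauchy product
    spectrum = np.fft.fft(a, n) * np.fft.fft(b, n)
    return np.real(np.fft.ifft(spectrum))

a = np.array([1.0, 2.0, 3.0])               # a short "signal"
b = np.array([0.0, 1.0, 0.5])               # a short "filter"

direct = np.convolve(a, b)                  # Cauchy product computed in the time domain
via_fft = cauchy_product_via_fft(a, b)      # same result through the frequency domain
```

Zero-padding to length len(a) + len(b) - 1 is what makes the circular convolution computed by the FFT agree with the ordinary Cauchy product.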
decomposition (e.g., Fourier coefficients in a Fourier expansion) | - | analysis | frequency components


Calculating the Fourier coefficients is “analysis,” and adding up the pure 
frequencies (i.e., summing the Fourier series) is called synthesis. But this 
view carries over more generally to engineering where there are more oper- 
ations involved on the two sides, e.g., breaking up a signal into its frequency 
bands, transforming further, and then adding up the “banded” functions in 
the end. If the signal out is the same as the signal in, we say that the analy- 
sis/synthesis yields perfect reconstruction. 
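
A two-band example of analysis followed by synthesis with perfect reconstruction can be sketched with the Haar filters. The function names and the test signal are hypothetical, and the signal is assumed to have even length.

```python
import numpy as np

def haar_analysis(signal):
    """Split an even-length signal into a low band and a high band, each half as long."""
    x = np.asarray(signal, dtype=float)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)    # local averages
    high = (even - odd) / np.sqrt(2.0)   # local differences
    return low, high

def haar_synthesis(low, high):
    """Recombine the two bands into a signal of twice the length."""
    x = np.empty(2 * len(low))
    x[0::2] = (low + high) / np.sqrt(2.0)
    x[1::2] = (low - high) / np.sqrt(2.0)
    return x

signal_in = np.array([4.0, 2.0, 5.0, 5.0, 1.0, -1.0, 0.0, 3.0])
low, high = haar_analysis(signal_in)
signal_out = haar_synthesis(low, high)   # agrees with signal_in up to rounding
```

The step “transforming further” in the text would act on low and high between the two calls; with no processing in between, the output reproduces the input and the total energy is split exactly between the two bands.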


integrate (e.g., inverse Fourier transform) | reconstruct | synthesis | superposition


Here the terms related to “synthesis” refer to the second half of the kind of 
signal-processing design outlined in the previous paragraph. 


subspace | - | resolution | (signals in a) frequency band


For a space of functions (signals), the selection of certain frequencies serves 
as a way of selecting special signals. When the process of scaling is in- 
troduced into optics of a digital camera, we note that a nested family of 
subspaces corresponds to a grading of visual resolutions. 


Cuntz relations | - | perfect reconstruction from subbands | subband decomposition


The operator relations of Joachim Cuntz described in detail in Chapter 7 
serve to make precise the kind of signal-processing design outlined in the 
previous paragraph. Although anticipated by engineers, the Cuntz relations 
give a particularly elegant way for us to understand the geometry of iterated 
subdivision schemes which include a variety of models with perfect recon- 
struction from signal and image processing. 
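
For orientation, the relations are usually stated as follows for N subbands. The operator names $S_i$ and the index range are generic notation assumed here; Chapter 7 gives the version actually used in the book.

```latex
\[
  S_i^{*} S_j = \delta_{i,j} \, I, \qquad
  \sum_{i=0}^{N-1} S_i S_i^{*} = I .
\]
```

The first identity says that each $S_i$ is an isometry and that different subbands are orthogonal; the second is perfect reconstruction, since it gives $f = \sum_i S_i S_i^{*} f$ for every signal $f$.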
inner product | correlation | transition probability | probability of transition from one state to another


In many applications, a vector space with inner product captures perfectly 
the geometric and probabilistic features of the situation. This can be axiom- 
atized in the language of Hilbert space; and the inner product is the most 
crucial ingredient in the familiar axiom system for Hilbert space. 


$f_{\text{out}} = T f_{\text{in}}$ | - | input/output | transformation of states


Systems theory language for operators $T\colon V \to W$. Then vectors in V are
input, and in the range of T output.


fractal | - | - | -


Intuitively, think of a fractal as reflecting similarity of scales such as is seen 
in fern-like images that look “roughly” the same at small and at large scales. 
While there may not be agreement about a rigorous mathematical definition, 
Mandelbrot originally defined fractals as sets whose Hausdorff-Besicovitch
dimension exceeded their topological dimension, but later accepted all self- 
similar, self-affine, or quasi-self-similar sets as fractals. Moreover, even 
more generally, the self-similarity could refer alternately to space, and to 
time. And further versatility was added, in that flexibility is allowed into the 
definition of “similar.” 

In this book, our focus is more narrowly on the self-affine variant (where 
computations are relatively simple); but in addition, we encounter the fractal 
concept in other contexts, e.g., for measures, and for probability processes 
(such as fractal Brownian motion). Further, we have stressed examples more 
than the general theory. 

The fractal concept for measures is especially agreeable in the self-affine
case, since affine maps act naturally on measures. So for each fractal dimension
s, there is a corresponding s-fractal probability measure $\mu = \mu_s$
(see Chapter 4) which is the unique solution to a natural fixed-point equation,
one which depends on s. Moreover, we may then recover this way the
spatial fractal set itself as the support of this measure $\mu$.
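
A familiar instance of such a fixed-point equation is the one for the middle-third Cantor set. It is chosen here only as an illustration; the maps $\tau_0$, $\tau_1$ and the equal weights are assumptions of this example, and the examples and notation in Chapter 4 may differ.

```latex
\[
  \tau_0(x) = \frac{x}{3}, \qquad \tau_1(x) = \frac{x + 2}{3}, \qquad
  \mu = \frac{1}{2} \left( \mu \circ \tau_0^{-1} + \mu \circ \tau_1^{-1} \right), \qquad
  s = \frac{\log 2}{\log 3} .
\]
```

There is exactly one probability measure $\mu$ solving this equation; its support is the Cantor set, and $s$ is the Hausdorff dimension of that set.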

As for s-fractal Brownian motion (fBm), the “s-fractal” feature there
refers to how the position $X_t$ at time $t$ of the fBm-process transforms under
scaling of $t$: If time $t$ scales by $c$, then the respective distributions before and
after scaling are related by the power-law $c^s$. Specifically, for all $t$, the finite
distributions calculated for $X_{ct}$ and for $c^s X_t$ coincide.
- | conditional expectation | - | -


Let $(\Omega, A, P)$ be a probability space.

Recall that a function $X\colon \Omega \to \mathbb{R}$ (the reals) is an A-random variable if
it is measurable with respect to the two $\sigma$-algebras on the domain and the
target, i.e., A on $\Omega$, and B := the standard Borel $\sigma$-algebra on $\mathbb{R}$. For an
A-random variable X, let $E(X) := \int_\Omega X \, dP$, or simply $\langle X \rangle$, denote the
corresponding mean (or expected value). If D is a sub-$\sigma$-algebra of A, define
a measure $\mu_X$ on D by $\mu_X(S) := \int_S X \, dP$ for $S \in D$. Then denote by
$E[X \mid D]$ the Radon-Nikodym derivative of $\mu_X$ with respect to P (restricted to D).

Let Y be a second A-random variable, and let $D := D_Y = Y^{-1}(B)$ be the
smallest $\sigma$-algebra with respect to which Y is measurable. Then clearly D
is a sub-$\sigma$-algebra of A, and we set $E[X \mid Y] := E[X \mid D] = \langle X \mid Y \rangle$.
This extends naturally to the case when there is a finite family of A-random 
variables in place of Y. 
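
On a finite probability space the definition reduces to averaging X over the sets where Y is constant. The following sketch works this out for one throw of a fair die, with X the number shown and Y its parity; the function name and the example are made up for illustration, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def conditional_expectation(omega, prob, X, Y):
    """Return E[X | Y] as a function on a finite sample space, stored as a dict."""
    result = {}
    for w in omega:
        atom = [v for v in omega if Y(v) == Y(w)]   # the event {Y = Y(w)}
        mass = sum(prob[v] for v in atom)           # its probability, assumed positive
        result[w] = sum(X(v) * prob[v] for v in atom) / mass
    return result

omega = range(1, 7)                                 # outcomes of one die throw
prob = {w: Fraction(1, 6) for w in omega}           # fair die

def X(w):
    return w                                        # the number shown

def Y(w):
    return w % 2                                    # parity: 1 for odd, 0 for even

cond = conditional_expectation(omega, prob, X, Y)
mean_of_cond = sum(cond[w] * prob[w] for w in omega)
```

Here cond takes the value 3 on the odd outcomes and 4 on the even ones, and averaging cond over the whole space returns the mean of X, namely 7/2.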


martingale | - | -


We use the term here only in its simplest form, the discrete case, called
“discrete-time”: Let $(\Omega, A, P)$ be a probability space. The following identity
defines a martingale. Consider a sequence of A-random variables $X_0$,
$X_1$, ... each with finite mean. We say it is a martingale if for each n,
the conditional expectation of $X_{n+1}$ given $X_0, X_1, \dots, X_n$ is $X_n$, i.e.,
$\langle X_{n+1} \mid X_0, \dots, X_n \rangle = X_n$ (see Feller’s book [Fel71, p. 210]). The term
was first used to describe a type of wagering in which the bet is doubled or 
halved after a loss or win, respectively. The concept of martingales is due to 
Lévy, and it was developed extensively by Doob [Doo94]. 

A random walk on the integers with each successive step left or right 
equally likely, i.e., with equal transition probabilities ($p_L = p_R = 1/2$), is
an example of a martingale. 
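
The martingale identity for this walk is a one-line computation, sketched below with exact fractions. Because the steps are independent, conditioning on the whole history amounts to conditioning on the current position; the function name and the parameter p_right are hypothetical.

```python
from fractions import Fraction

def conditional_mean_next(x_n, p_right=Fraction(1, 2)):
    """Expected next position of the walk, given that the current position is x_n."""
    p_left = 1 - p_right
    return p_right * (x_n + 1) + p_left * (x_n - 1)

# Symmetric walk: the expected next position equals the current one.
symmetric_ok = all(conditional_mean_next(x) == x for x in range(-5, 6))

# Biased walk (step right with probability 3/5): the walk drifts, so it is not a martingale.
drift = conditional_mean_next(0, Fraction(3, 5))
```

With equal transition probabilities the two possible steps cancel on average, which is exactly the defining identity; any bias produces a nonzero drift of size p_right minus p_left.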


- | data mining | -


The problem of how to handle and make use of large volumes of data is a 
corollary of the digital revolution. As a result, the subject of data mining it- 
self changes rapidly. Digitized information (data) is now easy to capture au- 
tomatically and to store electronically [HuTK05]. In science, in commerce, 
and in industry, data represents collected observations and information: In 
business, there is data on markets, competitors, and customers [AgKu04a, 
AgKu04b]. In manufacturing, there is data for optimizing production op- 
portunities, and for improving processes [Kus02, Kus05]. A tremendous 
potential for data mining exists in medicine [KuLD01, KuDS05], genetics
[ShKu04], and energy [KuBu05]. But raw data is not always directly us- 
able, as is evident by inspection. A key to advances is our ability to extract 
information and knowledge from the data (hence “data mining”), and to un- 
derstand the phenomena governing data sources. Data mining is now taught 
in a variety of forms in engineering departments, as well as in statistics and 
computer science departments. 

One of the structures often hidden in data sets is some degree of scale. 
The goal is to detect and identify one or more natural global and local scales 
in the data. Once this is done, it is often possible to detect associated similar- 
ities of scale, much like the familiar scale-similarity from multidimensional 
wavelets, and from fractals. Indeed, various adaptations of wavelet-like al- 
gorithms have been shown to be useful. These algorithms themselves are 
useful in detecting scale-similarities, and are applicable to other types of 
pattern recognition. Hence, in this context, generalized multiresolutions of- 
fer another tool for discovering structures in large data sets, such as those 
stored in the resources of the Internet. Because of the sheer volume of data 
involved, a strictly manual analysis is out of the question. Instead, sophis- 
ticated query processors based on statistical and mathematical techniques 
are used in generating insights and extracting conclusions from data sets. 
But even such an approach breaks down as the quantity of data grows and 
the number of dimensions increases. Instead there is a new research area 
(knowledge discovery in databases (KDD)) which develops various tools 
for automated data analysis. 

However, statistics is still at the heart of the problem of inference from 
the data. The widespread use of statistics, pattern recognition, and machine- 
learning algorithms is somewhat hindered in many areas by our ability to 
collect large volumes of data. The next limitation in the subject arises when 
the data is too large to fit in the main computer memory. As a result, we are 
faced with new issues, e.g., quality of data, creative data analysis, and data 
transformation [Kus01]. Theory and hypothesis formation now becomes 
critical in our task of deriving insights into underlying phenomena from the 
raw data. Various adaptations of wavelet-like algorithms have again proved 
useful in detecting scale-similarities, and in other types of pattern recogni- 
tion. Hence in this context wavelet ideas offer another tool for discovering 
structures in vast data sets, such as those in the resources of the Internet. 
And there are now a variety of such effective Web mining tools in use. 

Areas of data mining include problems of representation, search complex- 
ity, and automated use of prior knowledge to help in a data search. Thus we 
see the beginnings of a new science for efficient inference from massive data 
sets. 

If you stop to think about it, it is really ironic that Haar’s wavelet basis (Figures 
1.2 and 7.4, pp. 13 and 118-119) was missed for so long. It is especially ironic 
since Haar’s work in 1909-1910 had in it implicitly the key idea which got wavelet 
mathematics started on a roll 75 years later with Yves Meyer, Ingrid Daubechies, 
Stéphane Mallat, and others—namely the idea of a multiresolution. In that respect 
Haar was ahead of his time. In 1909, Haar’s measure on non-abelian groups became 
much more widely known. 

Yet, returning to wavelets, the word “multiresolution” suggests a connection to 
optics from physics. So that should have been a hint to mathematicians to take a 
closer look at trends in signal and image processing! Moreover, even staying within 
mathematics, it turns out that as a general notion this same idea of a “multiresolu- 
tion” has long roots in mathematics, even in such modern and pure areas as opera- 
tor theory and Hilbert-space geometry. And in probability theory and in dynamics, 
A.N. Kolmogorov and J. Doob had already long ago identified martingales, again 
close cousins of multiresolutions. Looking even closer at these interconnections, we 
can now recognize scales of subspaces (so-called multiresolutions) in classical algo- 
rithmic construction of orthogonal bases in inner-product spaces, now taught in lots 
of mathematics courses under the name of the Gram-Schmidt algorithm. Indeed, a 
closer look at good old Gram-Schmidt reveals that it is a matrix algorithm. Hence 
new mathematical tools involving non-commutativity! Obviously, function spaces 
are infinite-dimensional. Since Gram-Schmidt is recursive, it does not stop, at least 
not until we tell it to stop. We do that when the basis expansion has achieved a “good 
enough” approximation to the true function, or the true signal or image which is 
being analyzed. 
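
To make the recursion and the stopping rule concrete, here is a minimal Python sketch in finite dimensions. It is an illustration only: the function names and the tolerance `good_enough` are assumptions of the sketch, not notation from the book. The vectors are fed in one at a time, and the expansion stops as soon as the residual of the target is small enough.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _remove_components(v, basis):
    """Subtract from v its components along the orthonormal vectors in basis."""
    w = list(v)
    for e in basis:
        c = dot(w, e)
        w = [wi - c * ei for wi, ei in zip(w, e)]
    return w

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize the vectors in order, skipping (near-)dependent ones."""
    basis = []
    for v in vectors:
        w = _remove_components(v, basis)
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def approximate(target, vectors, good_enough=1e-3, tol=1e-12):
    """Run Gram-Schmidt step by step and stop once the residual of
    `target` has norm at most `good_enough` (a hypothetical stopping rule)."""
    basis, residual = [], list(target)
    for v in vectors:
        w = _remove_components(v, basis)
        norm = math.sqrt(dot(w, w))
        if norm <= tol:
            continue
        e = [wi / norm for wi in w]
        basis.append(e)
        residual = _remove_components(residual, [e])
        if math.sqrt(dot(residual, residual)) <= good_enough:
            break
    return basis, residual

vectors = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
used, residual = approximate([2, 2, 0], vectors)
print(len(used))  # 1: the first basis vector already captures the target
```

In a function space the list of input vectors never ends, so the stopping test is what turns the recursion into a finite computation.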

Approximation? So we must retain the “significant” terms in an analysis expan- 
sion and throw out the other terms! To know which is “significant,” thresholds must 
be assigned, and probabilities, and even entropy, from information theory must be 
used. 
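
As a small illustration of these two ingredients, the following Python sketch applies a hard threshold to a list of expansion coefficients and computes the Shannon entropy of their normalized energies. The function names, the sample coefficients, and the threshold value are assumptions chosen for the example; they are not taken from the book.

```python
import math

def hard_threshold(coeffs, threshold):
    """Keep coefficients with |c| >= threshold; replace the others by 0."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]

def energy_entropy(coeffs):
    """Shannon entropy (in bits) of the distribution p_k = c_k**2 / sum_j c_j**2."""
    total = sum(c * c for c in coeffs)
    if total == 0:
        return 0.0
    probs = [p for p in (c * c / total for c in coeffs) if p > 0]
    return -sum(p * math.log2(p) for p in probs)

coeffs = [0.9, -0.05, 0.3, 0.01]   # hypothetical expansion coefficients
kept = hard_threshold(coeffs, 0.1)
print(kept)                         # [0.9, 0.0, 0.3, 0.0]
print(energy_entropy(coeffs), energy_entropy(kept))
```

A low entropy indicates that the energy sits in a few terms, which is the situation in which discarding the remaining terms costs little.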

A Wiener process, carried to the limit, gives a nowhere differentiable continuous 
Brownian path, parametrized by time. This is a model for a physical Brownian tra- 
jectory of an actual particle. Actual Brownian particles do not follow paths that are 
precisely of this nature. 

If the signal to be analyzed is an image, then why not select a fixed but suit- 
able resolution (or a subspace of signals corresponding to a selected resolution), and 
then do the computations there? Of course, the selection of a fixed “resolution” is 
dictated by practical concerns. That idea was key in turning computation of wave- 
let coefficients into iterated matrix algorithms. As the matrix operations get large, 
the computation is carried out in a variety of paths arising from big matrix products. 
Such paths have been studied in probability since Kolmogorov in the 1930s, but paths 
are perhaps better known in their continuous variants. Yet, what we know about the 
continuous case is the result of limit considerations arising from the discrete case. 
The dichotomy, continuous vs. discrete, is quite familiar to engineers. The industrial 
engineers typically work with huge volumes of numbers. 
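
The simplest instance of such an iterated matrix algorithm is the Haar case. The Python sketch below is an illustration under stated assumptions (a signal whose length is a power of two, the orthonormal scaling by 1/sqrt(2), and hypothetical function names); it is not the book's code. Each step applies the same 2-by-2 block to pairs of numbers, and the step is then repeated on the low-pass output.

```python
import math

S = 1 / math.sqrt(2)

def haar_step(signal):
    """One level: scaled pairwise sums (low-pass) and differences (high-pass)."""
    low = [S * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    high = [S * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return low, high

def haar_pyramid(signal):
    """Iterate haar_step on the low-pass output until one number is left."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    details, low = [], list(signal)
    while len(low) > 1:
        low, high = haar_step(low)
        details.append(high)        # finest scale first
    return low, details

def haar_inverse(low, details):
    """Undo haar_pyramid, starting from the coarsest level."""
    low = list(low)
    for high in reversed(details):
        out = []
        for a, d in zip(low, high):
            out += [S * (a + d), S * (a - d)]
        low = out
    return low

x = [4, 6, 10, 12, 8, 6, 5, 5]      # hypothetical sample data
low, details = haar_pyramid(x)
print(low, details)
print(haar_inverse(low, details))   # recovers x up to rounding
```

Because every step is an orthogonal matrix, the sum of squares of the coefficients equals that of the input, so nothing is lost by working with the encoded numbers.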

Numbers! — So why wavelets? Well, what matters to the industrial engineer 
is not really the wavelets, but the fact that special wavelet functions serve as an 
efficient way to encode large data sets—I mean encode for computations. And the 
wavelet algorithms are computational. They work on numbers. Encoding numbers 
into pictures, images, or graphs of functions comes later, perhaps at the very end of 
the computation. But without the graphics, I doubt that we would understand any of 
this half as well as we do now. The same can be said for the many issues that relate 
to the crucial mathematical concept of self-similarity, as we know it from fractals, 
and more generally from recursive algorithms. 


Prerequisites and cross-audience 


I have used preliminary versions of this book in my courses. These courses fit the bill: 
“a second course in analysis,” in one form or the other. Some of my courses were 
more traditional, and for mathematics students, while others served a quite mixed 
audience, including students from engineering. At my university, there is a serious 
demand for interdisciplinary mathematics courses, not only involving engineers, but 
also students from physics, statistics, computer science, finance, and more. And I 
expect that this is a national trend. However, we found that many traditional math- 
ematics texts are too narrowly specialized and not well suited for cross-audiences. 
With that in mind, we have attempted a wider focus even within mathematics proper; 
specifically, we feel that a graduate text in analysis should be relevant to compu- 
tational mathematics and to numerical analysis. Our present emphasis on wavelets, 
signals and fractals invites such a wider focus. 

The students in my courses typically had some familiarity with function theory 
and measures. But their background was always diverse and varied. To accommodate 
diverse audiences (referring to the level of mathematical maturity, specialization, 
etc.), I included in my courses, and in the book, facts from a variety of topics. 

How? Some of the exercises serve this purpose, assimilation. For example, Ex- 
ercises 1.11 through 1.14 have the flavor of tutorials; they let students pick up on 
some quite basic but central points from Fourier series, inner-product spaces, linear 
transformations, matrices (finite and infinite), and Hilbert space. Working the exer- 
cises is the best way for students to learn and review fundamentals! In these multipart 
exercises, the reader is then guided step by step through the issues, but as they are 
needed in the text. As other basic topics are needed later in the book, there are then 
other multipart exercises that accomplish the same goal: e.g., Exercise 2.6 (integral 
operators), Exercises 2.7 through 2.10, and 9.8-9.9 (Brownian motion, including 
the fractional variant), Exercises 7.2 through 7.7 (matrix theory), and Exercises 7.9 
through 7.11 (tensor product of Hilbert space, and product measure). An advantage 
of this approach is that students then see and learn these preliminaries precisely in 
the form in which they are used in the course. 

When I was teaching mixed groups of students, the engineers in my class typi- 
cally came from the following departments, EE (electrical engineers), electrical and 
computer engineering, communications, mechanical, industrial engineering, data 
mining, sensors—among others! Generally, the engineers looked for an agreeable 
mathematical presentation of central ideas from signal and image processing. (“Why 
then wavelets?” you may say. Reason: Wavelet algorithms share a lot in common 
with signal/image processing schemes! And they are used by engineers!) 

Some of the engineers in my course worked at our University Hospital, a large 
teaching hospital. Their projects involve designing and improving technology in 
medical imaging. 

I recently gave one of my thesis students (with a good background in mathemat- 
ics and engineering) the following assignment: “Work out the mathematics used in 
the processing of color images in a digital camera!” Reason: This is actually lovely 
algorithmic matrix theory, and it is based directly on wavelet algorithms. 

Challenge! There is a huge difference in jargon used by engineers and by math- 
ematicians (not to mention the other disciplines!); and often there is a call for some 
serious translation of technical lingo before you discover that the two groups are talk- 
ing about the same thing. At times, my course would begin with first overcoming the 
cultural and the communication barriers; hence the system of interrelated appendices 
on polyphase matrices in the back of the book, and the list of names and discoveries 
on pages xxx-xxxiii below. 


Aim and scope 


The aim of this book is to show how to use processes from probability, random walks 
on branches, and their path-space measures in the study of convergence questions 
from harmonic analysis, with particular emphasis on the infinite products that arise 
in the analysis of wavelets. 

The focus is by nature interdisciplinary, and it is motivated by some new mathe- 
matical trends that are still somewhat in a state of flux. We outline how they combine 
diverse areas from mathematics and engineering in unexpected ways. As a result, we 
aim to address a diverse audience, perhaps unusually diverse, ranging from pure to 
applied (engineering and physics), from probability (random walk) to analysis (in- 
finite products), from wavelets to fractals, from linear to non-linear, from function 
theory to non-abelian operator algebras, from Lebesgue to Hausdorff measure, and 
from classical (Fourier) to modern (wavelets and more generally scale-similarity in 
time and space). This diversity presents us with a challenge, and we have taken pains 
to articulate the interconnections, describe a coherent unity, and address the union 
(and not the intersection) of the various groups of readers who have an interest in 
this area of mathematics. Our exposition is designed to reach the beginning graduate 
student, having in mind both students and workers meeting at the crossroads in a 
variety of fields. 


Self-similarity 


The idea of self-similarity is common to the construction of wavelets, various frac- 
tals, and graph systems. These subjects may seem diverse, and they are often thought 
of as disparate. However, we hope to outline how the notion of self-similarity runs 
through these subjects as a red thread connecting central themes: themes from har- 
monic analysis, discrete mathematics, probability, and operator algebras. In addition, 
we list the two papers [BoNe03] and [Nek04] as sources that cover this from a some- 
what different but related angle. 

Indeed, the concept of self-similarity has proved ubiquitous as well as fundamen- 
tal in mathematics and in diverse applications, perhaps because it serves to renormal- 
ize a rich variety of structures on nested families of scales (for example, similarity 
scales in time and/or in space). In wavelet theory, the scales may be represented in 
resolutions (a name deriving from optics) taking the form of nested systems of linear 
spaces. Similar such systems occur in fractal theory and in dynamics. And in quan- 
tum field theory [Nel73], the self-similarity idea underlies the notion of “renormal- 
ization group,” while in C*-algebra theory [Dav96], it gives rise to representations 
of algebras on relations and generators, such as operator algebras on infinite particle 
systems [PoSt70], or the algebras of Cuntz, or Cuntz-Krieger. In fact, the pioneering 
paper [Bat87] early on stressed a fascinating connection between the renormaliza- 
tion group in quantum physics on the one hand, and a certain fundamental wavelet 
construction on the other. 

To help the reader better see the bigger picture which we wish to present, we have 
included a system of interrelated appendices outlining a certain geometric approach 
to subband filtering which is both more operator-theoretic and more general than 
is usual. Our appendices build a bridge between two ways of doing wavelet/fractal 
subdivision algorithms, one based on quadrature conditions and their generalizations, 
and the other based on unitary operator functions and on representations of a certain 
C*-algebra. 

Recent developments in operator theory, approximation theory, orthogonal ex- 
pansions, and wavelets demonstrate the need for a combination of techniques from 
analysis and probability. Since the relevant tools and techniques for these new ap- 
plications are often interdisciplinary, they are not always readily available in the 
standard textbook literature. Or if they are, they are scattered over a number of older 
books or papers which are written for other purposes and aimed at other applications. 
The analysis problems that we focus on here are typically not presented together with 
their counterparts from probability theory: random walk, path space, infinite prod- 
ucts, martingales, Kolmogorov extension techniques in their classical form, and in 
some of their modern non-commutative versions. 

This book aims to fill an apparent gap in the literature, or at least to help bridge 
a gap. It gives a rigorous treatment of a number of convergence questions, and it 
also includes some new results. Our use of analysis in this book relies heavily on 
probabilistic tools, and the book offers a presentation of them. 


New issues, new tools 


This book is primarily about wavelets, but it takes a new direction in the subject: 
While it is about convergence issues, the questions come from applications with a 
twist that is different from the one that has dominated the literature. 

The aim of this book is broader. We wish to show how some methods from prob- 
ability, operator algebras, and random-walk theory can be used to prove theorems in 
analysis. The ideas from probability that we will use include processes, random walk 
on branches, and path-space measures associated with them. The recurring theme 
will be the use of these ideas from probability in the study of convergence questions 
from harmonic analysis. Our emphasis is on the kinds of infinite products that arise 
in the analysis of wavelets and more generally in the harmonic analysis of iterated 
function systems, as well as in dynamical systems. 
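
A worked example of such an infinite product, in the simplest (Haar) case: with the low-pass filter m0(xi) = (1 + exp(-i xi))/2, the partial products of m0(xi/2), m0(xi/4), ... converge to the Fourier transform of the indicator function of [0, 1]. The Python sketch below checks this numerically. The normalization m0(0) = 1, the Fourier kernel exp(-i xi x), and the function names are assumptions of the sketch.

```python
import cmath

def m0_haar(xi):
    """Haar low-pass filter, normalized so that m0(0) = 1 (assumed convention)."""
    return (1 + cmath.exp(-1j * xi)) / 2

def partial_product(xi, n_factors):
    """Product of m0(xi / 2**k) for k = 1, ..., n_factors."""
    value = 1 + 0j
    for k in range(1, n_factors + 1):
        value *= m0_haar(xi / 2 ** k)
    return value

def indicator_transform(xi):
    """Integral of exp(-i xi x) over [0, 1]; valid for xi != 0."""
    return (1 - cmath.exp(-1j * xi)) / (1j * xi)

xi = 3.0
for n in (1, 5, 10, 40):
    # The printed gap shrinks as more factors are included.
    print(n, abs(partial_product(xi, n) - indicator_transform(xi)))
```

The convergence here follows from a telescoping identity; for the frequency-localized filters discussed below, no such closed form is available, which is where the probabilistic tools enter.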

Many modern wavelet constructions from the past few years are by necessity 
frequency-localized; and because of that fact, the convergence issues and the tools 
that must be employed to resolve them are completely different from the more tradi- 
tional ones that were developed and tailored in the 1980s to time-localized functions, 
i.e., the wavelets that have the scaling function (the father function) and the wave- 
let function (the mother function) both of compact support. Those are the wavelets 
where multiresolution tools have been especially successful. 

By contrast, the frequency-localized wavelets have compact support only in the 
Fourier-dual variable, and their resolution subspaces are typically not singly gener- 
ated, so the more traditional multiresolution methods have fallen short, at least in the 
form in which they originally were given. 

What this means is that some of the fundamental issues concerning pointwise 
convergence in the theory must be attacked with quite different tools: in this case, 
with tools from random walk, probability, and the theory of diffusions, processes, 
and martingales. 

So the book is on the interface and applications of probability, random walk, and 
path-space measures to convergence questions in harmonic analysis and dynamics. 


List of names and discoveries 


Many of the main discoveries summarized below are now lore, and they are often 
used inside the book without explicit mention of the pioneers by name. (For page 
numbers of the indicated chapters and figures, refer to Contents on pages ix-xiii 
above and Figures on pages xxxix-xlii below.) 


1807: Jean Baptiste Joseph Fourier; mathematics, physics (heat conduction). 
Expressing functions as sums of sine and cosine waves of frequencies in arithmetic 
progression (now called Fourier series). Received somewhat late recognition for 
his work. 
References: Section 6.2; [Zyg32]. 

1909: Alfred Haar; mathematics. 
Discovered, while a student of David Hilbert, an orthonormal basis consisting of 
step functions, applicable both to functions on an interval, and functions on the 
whole real line. While it was not realized at the time, Haar’s construction was a 
precursor of what is now known as the Mallat subdivision, and multiresolution 
method, as well as the subdivision wavelet algorithms. 
References: Chapters 5, 6, 7, see also Figures 1.2, 6.1, 7.4; [Haa10, Waln02]. 

1946: Denes Gabor (Nobel Prize); physics (optics, holography). 
Discovered basis expansions for what might now be called time-frequency wavelets, 
as opposed to time-scale wavelets, the subject of this book. 
References: [JaMR01, Gro01]. 

1948: Claude Elwood Shannon; mathematics, engineering (information theory). 
A rigorous formula used by the phone company for sampling speech signals. 
Quantizing information, entropy, founder of what is now called the mathematical 
theory of communication. 
References: Section 1.4; [Sha49, AlGr01]. 

1935: Andrei Nikolaevich Kolmogorov; information theory, probability, statistics. 
Pioneered the use of probability theory in analysis, especially the use of measures 
on path space, the Kolmogorov consistency relation. 
References: Chapters 1, 2, 5; [Kol77, Wil91, Nev65]. 

1976: Claude Garland, Daniel Esteban; (both) signal processing. 
Discovered subband coding of digital transmission of speech signals over the 
telephone. 
References: Chapter 7, Figures 7.7, 7.14; [JaMR01, StNg96]. 

1981: Jean Morlet; petroleum engineer. 
Suggested the term “ondelettes.” J.M. decomposed reflected seismic signals into 
sums of “wavelets (Fr.: ondelettes) of constant shape,” i.e., a decomposition of 
signals into wavelet shapes, selected from a library of such shapes (now called 
wavelet series). Received somewhat late recognition for his work. Due to 
contributions by A. Grossman and Y. Meyer, Morlet’s discoveries have now come to 
play a central role in the theory. 
References: Section 7.2; [JaMR01, GrMo84]. 

1984: Alex Grossman; mathematics, quantum physics. 
Pioneered the theory of coherent (quantum-mechanical) states, precursors for dual 
wavelet systems. Mentor for, and coauthor with, J. Morlet and I. Daubechies. 
References: Sections 8.1-8.2; [Dau92, MeCo97, GrMo84]. 

1985: Yves Meyer; mathematics, applications. 
Mentor for A. Cohen, S. Mallat, and others of the wavelet pioneers, Y.M. discovered 
infinitely often differentiable wavelets. 
References: [JaMR01, Mey89, MeCo97, Mey97]. 

1989: Albert Cohen; mathematics (orthogonality relations), numerical analysis. 
Discovered the use of wavelet filters in the analysis of wavelets: the so-called 
Cohen condition for orthogonality. 
References: Chapters 5, 6; [Coh90, Dau92, Law91a, Law91b]. 

1986: Stéphane Mallat; mathematics, signal and image processing. 
While still a graduate student in engineering, working on vision, and on the 
Littlewood-Paley octaves, formalized and discovered what is now known as the 
subdivision, and multiresolution method, as well as the subdivision wavelet 
algorithms. This allowed the effective use of operators in the Hilbert space 
L^2(R), and of the parallel computational use of recursive matrix algorithms. 
References: Chapters 5, 6, 7; [Dau92, Mal89, Mal98]. 

1987: Ingrid Daubechies; mathematics, physics, and communications. 
Discovered differentiable wavelets, with the number of derivatives roughly half 
the length of the support interval. Further found polynomial algorithms for their 
construction (with coauthor Jeff Lagarias; joint spectral radius formulas). 
References: Chapters 1, 7, see also Figures 7.16, 7.5; [Dau92]. 

1991: Wayne Lawton; mathematics (the wavelet transfer operator). 
Discovered the use of a transfer operator in the analysis of wavelets: orthogonality 
and smoothness. 
References: Chapters 8, 9; [Law91a, Law91b, Dau92]. 

1999: Richard Gundy; use of probability in the analysis of filters. 
Discovered the use of martingales and of Kolmogorov’s 0-1 law for the rigorous 
analysis of wavelet filters in the analysis of wavelets; refined and corrected the 
so-called Cohen condition, orthogonality condition for wavelet filters. 
References: Chapters 1, 5, 6; [Gun99, Gun04]. 

1992: The FBI; using wavelet algorithms in digitizing and compressing fingerprints. 
C. Brislawn and his group at Los Alamos created the theory and the codes which 
allowed the compression of the enormous FBI fingerprint file, creating A/D, a new 
database of fingerprints. 
References: Chapter 7; [Bri95, BrRo91, JaMR01]. 

2000: The International Standards Organization. 
A wavelet-based picture compression standard, called JPEG 2000, for digital 
encoding of images. 
References: Chapters 3, 7, especially the figures; [JaMR01]. 

1994: David Donoho; statistics, mathematics. 
Pioneered the use of wavelet bases and tools from statistics to “denoise” images 
and signals. 
References: Chapter 7; [DoJo94, JaMR01]. 


General theory 


We consider a measurable space X with an endomorphism σ, mapping onto X, such 
that the inverse image of every point is of the same finite cardinality. The branches of 
the inverse are assigned probabilities by some positive function W, and we study the 
corresponding transition operator, also called the Perron-Frobenius-Ruelle operator 
R = R_W. By iteration, the branches determine a tree, and we study an associated 
random walk on this tree, and the transition measures indexed by the points in X. 

While there is a considerable literature on this setup already, the various papers 
place some kind of regularity condition on W, or on the system (X, σ), and the 
branches of the inverse. Our setting, assumptions and conclusions are in the measurable 
category. In contrast, when X is assumed to have a differentiable structure, 
and W is assumed Lipschitz, then, by Ruelle’s theorem, there are solutions to the 
equations νR = ν, Rh = h, and ν(h) = 1, where ν is a Borel probability measure 
on X, and h is a non-negative measurable function on X. 

While the measure v may not exist in the general measurable setting, we develop 
formulas for solutions to the equation Rh = h, the so-called R-harmonic func- 
tions, and we give applications to the case when R is the wavelet transition operator 
defined from some measurable low-pass filter. In that setting, there are existence 
questions, and we show that the random-walk properties determine a variety of no- 
tions of wavelet orthogonality properties. 
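
For readers who like to experiment, the setup can be sketched in Python for one concrete choice. Everything specific here is an assumption of the sketch and not a definition from the book: X = [0, 1), the endomorphism x -> 2x mod 1 with two inverse branches, the weight W(y) = cos^2(pi y), and a normalization in which the branch weights at each point sum to 1 (other conventions carry an extra constant factor). With these choices the constant function 1 is R-harmonic.

```python
import math
from itertools import product

N = 2  # number of branches of the inverse (assumed)

def sigma(x):
    """Endomorphism of X = [0, 1): x -> N*x mod 1."""
    return (N * x) % 1.0

def branches(x):
    """The N points y with sigma(y) = x."""
    return [(x + k) / N for k in range(N)]

def W(y):
    """Assumed weight cos^2(pi*y); W(x/2) + W((x+1)/2) = 1 for every x."""
    return math.cos(math.pi * y) ** 2

def ruelle(h):
    """Return R h, where (R h)(x) is the sum of W(y) * h(y) over sigma(y) = x."""
    return lambda x: sum(W(y) * h(y) for y in branches(x))

def walk_probability(x, digits):
    """Probability that the random walk on the tree, started at x,
    follows the branches indexed by `digits` (one digit per step)."""
    prob = 1.0
    for k in digits:
        x = (x + k) / N
        prob *= W(x)
    return prob

one = lambda y: 1.0
print(ruelle(one)(0.3))   # close to 1: the constant function is R-harmonic here
print(sum(walk_probability(0.3, d) for d in product(range(N), repeat=3)))
```

The second printed number is the total mass of the transition measure on paths of length 3 from the point 0.3; it is close to 1 because the branch weights sum to 1 at every node of the tree.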


A word about the graphics and the illustrations 


We owe much to the professional skill of Brian Treadway in programming and cre- 
ating the computer generated renditions of the numerous figures in this book. They 
are an essential part of the exposition. A list of our numbered figures, giving their 
captions and the page numbers where they appear, follows this “Getting started” 
section. This list includes the text for our figure captions. Readers are encouraged 
to preview the figures themselves to appreciate our use of trees, graphs, program- 
ming diagrams, subband-filtering schemes from engineering (signal/image process- 
ing), and a number of other visual tools and presentations. The author has himself 
found the figures, the graphs, and visualization of the algorithms exceptionally help- 
ful in learning, teaching and discovering some of the material in this book; but more 
importantly, also in understanding the deeper connections in the subject. 

The Mathematica program used in the production of Figure 7.5 (pp. 120-121) 
by Brian Treadway is given in full in the narrative in the “References and remarks” 
section at the end of Chapter 7 (pp. 152-153). The two Figures 7.4-7.5 (pp. 118-121, 
each figure spanning two pages) serve to illustrate a particular aspect of the so-called 
pyramid algorithm for wavelets, i.e., the recursive algorithm which is used, among 
other things, in the creation of wavelet packets; see, e.g., Figure 7.3 (p. 113). For this 
purpose, the algorithm is used in the two figures with two different initializations. 
The first is the simpler one, the Haar scaling function (see Figure 7.4, pp. 118-119), 
and it makes the dyadic subdivision especially transparent. The second initializes 
with Daubechies’ scaling function (see Figures 7.5 and 7.16, pp. 120-121 and 134). 
The reason for the name “wavelet packet” (which is the subject of Chapter 7) is espe- 
cially transparent in Figure 7.5. Notice especially two aspects of the progression of 
functions in the sequence of graphs inside the respective figures: In moving from one 
graph to the next, the numerical frequency appears to increase with each subdivision- 
step. But in addition, an enveloping shadow emerges in the progression through the 
graphs, and a shape of a wave with a lower frequency (i.e., longer wavelength) appears 
to “capture” and group the functions themselves into packets, much like the enveloping 
“beats” from music composition. We mention these figures already now, as 
the geometry of the steps that go into the diverse recursive algorithms is especially 
transparent to the naked eye in Figures 7.4-7.5. See also the sequence of Figures 
7.6-7.13 (pp. 123-128) for the programming aspects of the same idea. Specifically, 
Figure 7.13 stresses the matrix steps that are indicated by the recursions. 
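
The way four numbers generate a function by repeated subdivision can be tried out in a few lines. The Python sketch below is not the Mathematica program mentioned above; it is a minimal cascade iteration under stated assumptions: the standard Daubechies four-tap coefficients normalized to sum to sqrt(2), the Haar pair for comparison, and hypothetical function names. Each step upsamples the current samples and convolves them with the mask sqrt(2)*h, so that after `levels` steps the output approximates samples of the scaling function at the points k / 2**levels.

```python
import math

SQRT2, SQRT3 = math.sqrt(2), math.sqrt(3)
# Daubechies four-tap low-pass coefficients (sum = sqrt(2), sum of squares = 1).
DAUB4 = [(1 + SQRT3) / (4 * SQRT2), (3 + SQRT3) / (4 * SQRT2),
         (3 - SQRT3) / (4 * SQRT2), (1 - SQRT3) / (4 * SQRT2)]
HAAR = [1 / SQRT2, 1 / SQRT2]

def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def upsample(c):
    """Insert a zero between consecutive samples."""
    out = [0.0] * (2 * len(c) - 1)
    out[::2] = c
    return out

def cascade(h, levels):
    """Repeat `levels` times: upsample, then convolve with sqrt(2) * h."""
    mask = [SQRT2 * hk for hk in h]
    c = [1.0]
    for _ in range(levels):
        c = convolve(upsample(c), mask)
    return c

print(cascade(HAAR, 2))          # four samples of the box function
samples = cascade(DAUB4, 6)
print(len(samples), sum(samples) / 2 ** 6)
```

With the Haar pair the samples are constant, which is the dyadic subdivision of the box function; with the four-tap coefficients the samples trace out an increasingly detailed curve on the interval [0, 3], and the Riemann sum in the last line stays close to 1.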

We further stress that these figures play a central role in our presentation in this 
book: Some figures illustrate the kind of self-similarity in time and in space coordi- 
nates that is typical both in fractal analysis, and in the study of wavelets; while others 
illustrate decision trees; and yet others make clear the kinds of arrow-flow diagrams 
which are popular in building of recursive algorithms, and in programming more 
generally. See, e.g., [Knu84] as well as Knuth’s monumental volumes [Knu81]. We 
mention Knuth’s article [Knu84] as it emphasizes in a special case both the stochastic 
and the algorithmic features of the fundamental subdivision/filter algorithm. 

For explanation of the cover figure, see “About the cover figure” at the front of 
the book (p. vi). 


Special features of the book 


A main aim of this book is to show how these ideas have proved fruitful in both 
the study of iterated function systems (IFS), see Chapter 4, and of wavelets, see 
Chapter 7. While the pyramid algorithm (in its diverse incarnations) is now typi- 
cally identified with wavelets, it in fact has a long history in engineering, informa- 
tion theory [Sha49], and symbolic dynamics. The connection to signal and image 
processing from engineering was emphasized in our survey article [Jor03]; but see 
also [DoMSS03, Mal98, StNg96] for much more detail. The connections to sym- 
bolic dynamics are manifold, but we wish to especially recommend the beautiful and 
inspiring invitation [Rad99] addressed to students, and authored by Charles Radin. 
The book contains four separate items which we hope will help readers recon- 
cile current terminology used simultaneously in mathematics and in applications 
(especially in signal-processing engineering, and in physics—notably in optics!) 
These four elements are as follows: (i) a system of interrelated appendices (pp. 205-222), 
(ii) a list of comments for mathematicians on signal/image 
processing jargon (“Afterword,” pp. 223-230), and (iii) a list of names (with historical 
comments) of mathematicians and scientists, both past and current era 
mathematicians/engineers/scientists who made pioneering contributions to the main ideas 
presented in the book (pp. xxx-xxxiii above). 

Finally, in item (iv) we include a glossary consisting of a list of terms occurring 
in the book in varied contexts of mathematics, probability, engineering, and on oc- 
casion physics (pp. xvii-xxv above). To reduce the apparent confusion created by 
the same concept having up to four different names, the glossary includes informal 
explanations spelling out the reasons behind the differences in current terminology 
from neighboring fields. 

In item (i) we attempt to translate the various engineering terms and constructs 
into mathematical formulas. This is a continuation of a theme we started in an AMS 
Notices article [Jor03]. For (iii), we apologize for the subjective nature of our comments; 
and we readily acknowledge the difficulties in writing history of events that 
are rather modern on a mathematical scale. All three concluding additions to the 
book are motivated by the nature of the subject at hand, specifically by the many 
connections between ideas from signal/image processing and the more mathematical 
themes from wavelets, fractals, and dynamics. 


Exercises: Overview 


The exercises are essential, and they serve several pedagogical purposes: They are 
there to help students and users practice the fundamental concepts in the book, and 
to test their hand at computations, and at sketching functions or iterative schemes. 
But they also serve to expand horizons! We hope students will acquire a hands-on 
feeling for the various basis functions already discussed in the book; and more 
importantly, get started at building bases of their own. Some readers might even 
want to begin with the exercises, and then read the text as they work along in the 
exercises. Indeed, in designing the exercises, I have been inspired, at least in part by 
books with a philosophy like this: “Learn Hilbert space by doing problems!” e.g., 
[Hal67]. Or: “Operator algebras by example!” e.g., [Dav96]. Finally, several of our 
exercises have been put in to expand on themes inside the text, pointing the reader 
to new developments in the subject, and especially stressing links to neighboring 
subjects and to applications; pointing to connections between modern trends and 
classical subjects, highlighting an especially powerful (and beautiful) idea, or even 
suggesting unorthodox applications. 

One exercise in Chapter 7 is different from the others; it is a multipart exercise 
involving a search on the Internet. The student is asked to browse images (from 
image-processing engineering) on the web, and to follow visually how a certain (ge- 
ometric and algorithmic) cascading process is made concrete. The mathematics of 
the cascade is worked out in Chapter 7 itself. 

The exercises vary, with variation both within a given chapter, and variation from
chapter to chapter. The longer chapters are supported by proportionally more exer- 
cises. Some of the exercises are short, and some are long. Some are conceptual, and 
others are computational. Some are hard, and a few are relatively easy. The latter 
are meant mainly for the student to practice the use of one definition or another. Yet 
others serve to extend material covered in the book, and again others are there to help 
the student review background. Then there is a sliding scale up to the more difficult 
exercises, but the reader will then find hints. Finally, some exercises are built up in 
steps as multipart exercises with many parts. So what is a single numbered exercise, 
for example Exercise 7.9, may have many parts, (a), (b), etc., and they progress in 
logical steps, often with one part building on the previous. In all, the exercises make 
up more than 40 pages. 


We have presented the material so that different readers can select the parts of it 
that are closest to their own interests; and in particular, it is not necessary to begin
with Chapter 1. In fact, for some it might be better to begin with the Appendices, 
or with the Afterword containing the special sections “Comments on signal/image 
processing terminology” and “Computational mathematics,” or for some insight into 
the history of the subject, the “List of names and discoveries” on pages xxx-xxxiii
above. While some chapters build on earlier ones, non-sequential reading is possible, 
especially with the use of two extensive index lists, an index of “Symbols” and a 
general “Index,” which are the last items in the book. The following quote may be 
appropriate for interdisciplinary mathematics books. 


Det er ganske sandt, hvad Philosophien siger, at Livet maa forstaaes
baglænds. Men derover glemmer man den anden Sætning, at det maa leves
forlænds. Hvilken Sætning, jo meer den gjennemtænkes, netop ender med,
at Livet i Timeligheden aldrig ret bliver forstaaeligt, netop fordi jeg intet
Øieblik kan faae fuldelig Ro til at indtage Stillingen: baglænds.¹


—Søren Kierkegaard


Iowa City, 
June 2006 Palle E. T. Jorgensen


¹ “It is really true what philosophy tells us, that life must be understood backwards. But
with this, one forgets the second proposition, that it must be lived forwards. A proposition 
which, the more it is subjected to careful thought, the more it ends up concluding precisely 
that life at any given moment cannot really ever be fully understood; exactly because there 
is no single moment where time stops completely in order for me to take position [to do 
this]: going backwards.” Often shortened to “Life can only be understood backwards; but 
it must be lived forwards” (Livet skal forstaas baglæns, men leves forlæns).


Le plus court chemin entre deux vérités dans le domaine réel passe par le
domaine complexe. —Jacques Hadamard 
[The shortest path between two truths in the real domain passes through the complex domain.]


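1.1 $N = 2$, simple dyadic branch points; a function $W\colon X \to [0,1]$
is given. A path in the random-walk model. Assignment of
probability to a particular length-five path: $P_x^{(5)}(\{(0\,1\,1\,0\,1)\}) =
W(\tau_0 x)\, W(\tau_1\tau_0 x)\, W(\tau_1^2\tau_0 x)\, W(\tau_0\tau_1^2\tau_0 x)\, W(\tau_1\tau_0\tau_1^2\tau_0 x)$.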

1.2 Haar wavelets: The two functions $\varphi$ and $\psi$ are the father, resp., the
mother functions for Haar’s construction in cases (a) on the left, and
the stretched Haar (b), on the right. In each case, we use $\psi$ to create
a double-indexed basis system (1.3.16). For (a) this system will be
an orthonormal basis (ONB) in $L^2(\mathbb{R})$; but for (b), the functions
(1.3.16) will only represent a Parseval frame, i.e., a function system
which yields the Parseval identity (1.3.22) for every function in
$L^2(\mathbb{R})$. As is apparent from (b), this Parseval system will not be an
ONB for the stretched version. We resume the discussion of this in
the text following Figure 6.1 (p. 103).

1.3 The slanted matrices F and G of (1.7.2).

X4, the conjugate or left-handed quarter Cantor set.
Alternate limiting approach to the conjugate quarter Cantor set X4.
The 2-adic fractions.

Haar’s wavelets (compare Figure 1.2, p. 13): Haar’s two functions $\varphi$
(6.2.3) and $\psi$ (6.2.4) are illustrated in the two columns: The system
in the column on the left in the figure illustrates Haar’s orthonormal
(ONB) wavelet basis, and the scaling identities are visually
apparent. The system of functions $\varphi_3$ (6.2.5) and $\psi_3$ (6.2.6), for
$p = 3$, in the right-hand column illustrates the two stretched Haar
functions. The two scaling identities (6.2.1) and (6.2.2) are visually
apparent for both function systems, on the left and on the right.
Moreover, for both systems, the $\psi$ function yields a double-indexed
Parseval basis (see (1.3.16)), i.e., a function system for which the
Parseval identity (6.2.12) holds for every function $f$ in $L^2(\mathbb{R})$. If
$p = 1$ (the left-hand-side case), then the inequality in (6.2.11) is an
equality, i.e., is an “=”; while for $p = 3$ (the right-hand-side column
in the figure), the term on the left in (6.2.11) is strictly smaller than
$\|f\|^2_{L^2(\mathbb{R})}$ for non-zero $f$. The reason for this is the overlap for the
$\mathbb{Z}$-translated functions, which is graphically illustrated too.
Solutions to the eigenvalue problem (6.2.19).

Subband filtering for dyadic wavelets: $m_0$ = low-pass filter,
$m_1$ = high-pass filter, ↓ down-sampling, ↑ up-sampling. The
figure refers to input (from left) and output (to the right) of signals
from standard signal processing; for example, of speech signals.
From the left, we start with (signal) input. The signal is then split
into two frequency bands with two filters $m_0$ and $m_1$. The first filter,
$m_0$, passes low frequencies, and the second, $m_1$, passes high. The
split into frequency bands is called “analysis.” We get two signal
strands which are now each followed by signal down-sampling. As
we move right, the two bands are then up-sampled. The processed
signal bands then pass dual filters, and are finally merged again with
+ into an output signal. “Perfect reconstruction” means that the
output signal to the right is matched up perfectly with the input from
the left. From engineering, we know that it is possible to find filters
$m_0$ and $m_1$ that achieve perfect reconstruction; i.e., recovering the
input signal by synthesis of the bands. Magically, and by hindsight,
these filter systems serve at the same time to give us orthogonal
wavelets, subject to a technical condition discussed in Chapter 5.
(There can be more than two frequency bands, and we refer to
Section 7.6 for the general case.)

C.1 Representation of S = TU*. This and the next diagram illustrate
the use of the operator relations from Appendix C in synthesis of 
subbands of a signal, or of a subdivision of an image. The two 
diagrams form a pair. The present first one represents a matrix 
action transforming one band-configuration into another. This is 
followed by up-sampling applied to each strand. There is then a 
system of translations before the strands are merged and added. This 
diagram is a natural dual version of the next one, Figure C.2, i.e., 
the analysis step which begins with subdivision, or breaking apart 
an input signal.

C.2 Representation of S* = UT*. The present two figures together, 
Figures C.1 and C.2, represent equivalent formulations of the 
respective left-hand and right-hand sides in Figure 7.7 (p. 124). But 
the present figures in fact represent signal-processing algorithms 
(or equivalently wavelet algorithms) that are more versatile than 
the more traditional polyphase matrices: specifically, the matrix 
functions in our present diagrams may be arbitrary, and be of 
arbitrary size. The size of a polyphase matrix equals the number 
of subbands which is used. But we also allow non-unitary matrix 
functions. Of course, more general choices of matrices in Figures 
C.1-C.2 will then correspond to more general choices of filters in 
Figure 7.7. These choices are highly relevant, for example for the 
algorithms based on lifting, see [DaSw98]. Our present Figures 
C.1-C.2 are multiband versions of diagrammatic representations of 
subsampling/subband-filtering algorithms which appear frequently 
in the signal-processing engineering literature; see for example 
[WWW1] with text, and [WWW2] and [WWW3] with pictures.


The image collage on the next page illustrates central themes in the book:
The three oscillatory graphs placed down diagonally in the collage repre- 
sent the kind of functions generated algorithmically by tree-like, or so-called 
pyramid, algorithms. These recursive algorithms are structured around 
pyramid shapes as in the top right corner of the collage, and they come 
from successive repetitions of dyadic branching steps from the sketch in the
lower left corner of the collage. See also “About the cover figure” (page vi). 


Introduction: Measures on path space


We believe that all curious people can enjoy and understand great mathe- 
matical ideas without having to brush up on garden-variety school math or 
relive their painful algebra daze. —Edward B. Burger, Michael Starbird


PREREQUISITES: A sense of curiosity! Algebra of the integers and the real num- 
bers; rudimentary facts about functions and measures; limits; compactness; Cantor; 
matrices; inner-product spaces. 

Question: “What if I don’t have the prerequisites for reading the prerequisites?” 
A: “Doesn’t matter! Skip Chapter 1!” 


Prelude 


In the first section of this chapter, we introduce the fundamental concepts of wavelet 
functions and wavelet constructions, but in a form which stresses scale-similarity. 
This scale-similarity is a special case of a more general notion of self-similarity. 
We will later see that the more general idea of self-similarity is needed for our un- 
derstanding of fractals and symbolic dynamics. 

One of the difficulties in moving from the traditional setting of wavelets to the 
more general iterative models is the distinction between linear and non-linear: Recall 
that wavelets are built in the linear space $\mathbb{R}^d$, which in turn comes equipped with its
canonical d-dimensional Lebesgue measure. By contrast, fractals are typically non- 
linear structures, not even groups; and one of the first issues confronting us will be 
that of finding a substitute for Lebesgue measure. As it turns out, such substitute mea- 
sures exist, having certain intrinsic properties quite analogous to those of Lebesgue 
measure; but the new measures depend on the particular fractal X under discussion. 
Once a particular X is specified (by a finite system of affine mappings), we will 
see that it acquires a natural and intrinsic fractal dimension (called the Hausdorff 
dimension, say $s$), and a corresponding fractal measure called the Hausdorff measure.
However, note that there is an $s$-dependence. Specifically, we have the existence
of a Hausdorff measure $\mu_s$ on $X$ for each value of $s$. And for the traditional case of
$\mathbb{R}^d$, we have $s = d$. Later we shall outline iterative schemes which yield $\mu_s$.

Further into the book, we will then extend both Fourier tools and wavelets to 
these fractals. The class of fractals we consider is somewhat restricted. But these are 
the fractal classes where a reasonable harmonic analysis is feasible. 

To make the ideas more concrete, we concentrate initially on simple and very 
explicit fractals such as the middle-third Cantor set, and other Cantor sets on the line. 
But the various more general points and constructions will only emerge gradually as 
the subjects in the book progress in a natural order. Hence, Section 1.1 will include 
also a brief chapter-by-chapter outline of the central ideas in the book, stressing 
throughout how our random-walk model is used. 

To help highlight wavelets, fractals, and signals, throughout the book, the nar- 
rative is illustrated with figures. Each figure is there to help students visualize main 
ideas, and many of the figures are created from recursive Mathematica programs, by 
Brian Treadway. 

And in addition to the figures inside the text, each chapter contains little graphical 
ornaments or “dingbats.” They should help readers navigate the book, and they also 
serve as reminders of the three themes. Some dingbats mark separation between the 
prelude sections and the rest of the chapter, while others are placed near the end of 
each chapter to mark the start of the exercises. 

A closer inspection reveals that the dingbats that follow the chapter preludes are 
all different. Some dingbats represent certain selections from families of images. 
Within each family though, the dingbat images are different. 


1.1 Wavelets


Wavelets are families of functions of one or several real variables, i.e., functions on
$\mathbb{R}$ or on $\mathbb{R}^d$, satisfying three properties:


(i) They form a basis for $L^2(\mathbb{R})$ (or $L^2(\mathbb{R}^d)$ in the several-variable case) and have
suitable orthogonality properties which we spell out below. 

(ii) They are indexed by integer translations, i.e., by the operation $x \mapsto x + k$, for
$k \in \mathbb{Z}$, and by integer powers of a scaling. Often the scaling is dyadic, i.e.,
scaling is by powers $2^j$, for $j \in \mathbb{Z}$.

(iii) There are two generating functions $\varphi$ and $\psi$ (called father and mother functions)
for the whole double-indexed wavelet which relate the two operations of scaling
and translation in the following way:


$$\varphi = 2 \sum_{k} P_k\, \varphi(2\,\cdot - k), \qquad \psi = 2 \sum_{k} Q_k\, \varphi(2\,\cdot - k),$$


where the coefficients $P_k$ and $Q_k$ are carefully chosen.
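
As a hedged illustration of the two relations in (iii), the following sketch checks them numerically for Haar's functions. It assumes the normalization with prefactor 2, that is, $\varphi = 2\sum_k P_k \varphi(2\,\cdot - k)$ and $\psi = 2\sum_k Q_k \varphi(2\,\cdot - k)$, and the coefficient values $P_0 = P_1 = 1/2$, $Q_0 = 1/2$, $Q_1 = -1/2$. These values and the helper names (`phi`, `psi`, `refine`) are chosen here for illustration and are not quoted from the text.

```python
def phi(x):
    """Haar father function: indicator function of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0


def psi(x):
    """Haar mother function: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


# Assumed normalization: f = 2 * sum_k c_k * phi(2x - k).
P = {0: 0.5, 1: 0.5}    # assumed Haar coefficients for the father function
Q = {0: 0.5, 1: -0.5}   # assumed Haar coefficients for the mother function


def refine(coeffs, x):
    """Right-hand side of a dyadic scaling relation, evaluated at x."""
    return 2 * sum(c * phi(2 * x - k) for k, c in coeffs.items())


# Dyadic sample points in [-0.5, 1.5); they are exact in floating point.
grid = [i / 64 - 0.5 for i in range(128)]
scaling_ok = all(refine(P, x) == phi(x) for x in grid)
wavelet_ok = all(refine(Q, x) == psi(x) for x in grid)
print(scaling_ok, wavelet_ok)
```

Both checks compare the two sides of a relation only at sample points, so they illustrate the identities rather than prove them.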


Comments on (iii). The choice of the coefficients is essential, and it has both 
an algebraic and a geometric flavor. We will see that various admissible classes for 
the coefficient systems $\{P_k, Q_k\}$ may be thought of as quadratic varieties, and there
are solutions both in finite and infinite dimensions. The two relations in (iii) above
represent only the dyadic case of what is called scaling relations. The first is an
equation for a function $\varphi$, and is called the dyadic scaling equation. The function $\varphi$ is
called a scaling function. The context and the ambient function space dictate choices
for the $P_k$’s, which in turn yield non-trivial solutions $\varphi$. Moreover, the possibilities for
solutions $\psi$ are extremely sensitive to these choices, and we shall develop algorithms
and discuss stability for solutions $\varphi$ and $\psi$.

The coefficients $P_k$ are called masking coefficients, and conditions on these coefficients
must always be imposed. This is a crucial point which will be taken up in
Section 1.3 below, and in more detail again within the book itself. 


The context for solutions: Our context will include both standard Lebesgue 
measure and some classes of fractal measures as well. Specifically, we shall consider 
higher dimensions, such as $\mathbb{R}^d$, and more general scaling operations (e.g., matrix
scaling). We will further introduce a variety of fractal structures which come with 
associated scaling relations. But in (iii) we have listed the simplest scaling relations 
only to illustrate the general idea. Special cases of classes of conditions for the coefficient
systems $\{P_k, Q_k\}$ are known as quadrature conditions, or quadrature-mirror
conditions. The ambient function space might be simply the familiar Hilbert space
$L^2(\mathbb{R})$, or it may be a Hilbert space built on some affine fractal, which in turn is
designed from a matrix scaling and an iterated function system of affine maps in $\mathbb{R}^d$
(definitions in Section 1.3 and Chapter 4). However, depending on the context, the
admissibility conditions for the coefficients $P_k$ are different.

The various problems connected with classes of scaling relations have tradition- 
ally been addressed with only standard tools from analysis. Here we will instead 
employ a suitable mix of analysis and path-space methods from probability. 

In these problems, solutions are constructed from an algorithm which relates a
certain function $\varphi(t)$, $t \in \mathbb{R}$, called a scaling function, to its scaled version $\varphi(Nt)$
where $N$ is a fixed integer, 2 or more, or an expansive integral $d$-by-$d$ matrix $A$ if $t$
is in $\mathbb{R}^d$.

For $d = 1$, the formula which determines $\varphi$ is (1.3.1) below. It is called the
scaling identity. (Note that the first relation in (iii) is a special case of (1.3.1).) It
is not at all clear a priori that this identity should have $L^2$ solutions at all, or even
solutions that are given by some kind of convergent algorithm. The coefficients in
(1.3.1) are called masking coefficients, filters, or filter coefficients. For special values
of the coefficients, it turns out that there is a normalized $L^2$ solution $\varphi$, and that
pointwise convergence of a good approximation can be established in a meaningful 
way. Moreover, we show that the convergence behavior is dictated by properties of 
an associated transfer operator R, or Ruelle operator. 

The expression on the right-hand side in equation (1.3.1) is also called a subdivision
because of the function values $\varphi(Nt - k)$. An iteration of the operations on the
right-hand side in (1.3.1) on some initial function is called a cascade approximation.
This approximation is one approach to the function $\varphi$, and the other is the infinite
product formula (1.3.5) for the Fourier transform of $\varphi$.
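
The cascade approximation can be sketched in a few lines. The example below is only an illustration under stated assumptions: it takes $N = 2$, assumes the normalization $\varphi(t) = 2\sum_k P_k \varphi(2t - k)$ with $\sum_k P_k = 1$, uses the standard Daubechies four-tap coefficients rescaled to that normalization, and starts from the indicator function of $[0, 1)$. Formula (1.3.1) itself is not reproduced in this section, so the normalization and the function names (`cascade_step`, `cascade`) are assumptions made here.

```python
import math

# Daubechies four-tap coefficients, rescaled so that sum(P) == 1.
# Assumed normalization: phi(t) = 2 * sum_k P[k] * phi(2t - k).
s3 = math.sqrt(3.0)
P = [(1 + s3) / 8, (3 + s3) / 8, (3 - s3) / 8, (1 - s3) / 8]


def cascade_step(values, level, P):
    """One subdivision step of the cascade.

    values[j] is the constant value of the current approximation on
    [j / 2**level, (j + 1) / 2**level). The result holds the values of
    the next approximation on the grid of mesh 2**-(level + 1).
    """
    shift = 2 ** level                 # 2t - k moves the index by k * 2**level
    support = len(P) - 1               # all iterates vanish outside [0, support)
    size = support * 2 ** (level + 1)
    new = [0.0] * size
    for j in range(size):
        total = 0.0
        for k, p in enumerate(P):
            i = j - k * shift
            if 0 <= i < len(values):
                total += p * values[i]
        new[j] = 2.0 * total
    return new


def cascade(P, steps):
    """Iterate the subdivision, starting from the indicator of [0, 1)."""
    support = len(P) - 1
    values = [1.0] + [0.0] * (support - 1)   # mesh 1 on [0, support)
    for level in range(steps):
        values = cascade_step(values, level, P)
    return values


steps = 8
values = cascade(P, steps)
mesh = 2.0 ** -steps
integral = sum(values) * mesh   # each step preserves the integral
print(len(values), integral)
```

After `steps` iterations, `values[j]` is the value of the piecewise-constant approximation on the interval starting at `j * mesh`. Because the coefficients sum to 1, every step preserves the integral of the initial function, which gives a simple sanity check on the iteration.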

The relation between $\varphi$ and its scaled refinement is well understood when we
pass from the time domain to the frequency domain via the Fourier transform. In that 
case the relation is multiplicative, and involves a certain periodic matrix function m, 
called the low-pass filter. For further details, see formula (1.3.4) below. 

The study of the filters m is part of signal processing (see, e.g., [Jor03, StNg96]). 
But by a “mathematical miracle,” they have become one of the most useful tools in 
wavelet constructions; and at the same time, they have pointed to a host of exciting 
applications of wavelet mathematics. To get a path-space measure for some of the 
wavelet problems, we use the quadrature-mirror properties, or their generalizations, 
which are assumed for the filter function $m$, also called a frequency response function.
It is periodic, in one or several variables. In $\mathbb{R}^d$, there is a variety of choices of
a period lattice for the problems at hand. 

Since the first question is to decide when $\varphi$ is in $L^2$, we iterate and get an infinite
matrix product involving the matrix $W := m^* m$. Since $W$ is positive semidefinite,
we may create a positive path measure of a random walk starting at $x$ in some period
interval. In several dimensions, $x$ starts in a fundamental domain $D$ for some
fixed lattice, for example $\mathbb{Z}^d$. The paths starting at $x$ arise by iteration of the inverse
branches of $x \mapsto Ax \bmod \mathbb{Z}^d$. There are $N = |\det A|$ distinct branches. These $N$
branches may be viewed as endomorphisms of D. 

We construct our random walks in a general framework which includes both 
wavelets, wavelet packets (see Chapter 7), and some of the other more classical 
problems. We further show how some of the classical questions may be phrased 
and solved with the use of path-space measures. 

It is interesting to contrast our proposed approach with the more traditional one 
used in wavelet analysis: see, e.g., [Dau92]. Traditionally, some kind of Lipschitz 
or Dini regularity condition must be assumed for the filter. Then the corresponding 
infinite product may be made precise, and we can turn to the question of when the 
wavelet generators are in $L^2(\mathbb{R}^d)$. As it turns out, both of these issues have natural
formulations and solutions in terms of the path-space measures. The results allow 
a wider generality. In a variety of wavelet questions for band-limited wavelets, it is 
just an unavoidable fact that the regularity conditions are not satisfied for the filters
m that are dictated by the setting and the applications. 

Before turning to the details, we sketch a brief outline. We resume the summary 
in more detail in Section 1.6 below. The remaining sections in this chapter contain 
more definitions, motivation and examples, which will be used in the overview at the 
conclusion of Chapter 1.

In Section 1.2, we introduce the basic concept of a probability space built on
covering mappings with a fixed number of branches. These systems give rise to random
walks on trees, paths, and transition probabilities.

In Section 1.3, we illustrate the relevance of the path-space measures to some 
questions in wavelets, and in Section 1.4, to sampling theory. To make the questions 
more concrete, we outline four examples at the end of Section 1.3, two variants of 
Haar’s wavelet construction, Daubechies’ wavelet, and finally we sketch an analo- 
gous harmonic analysis question for the middle-third Cantor set. 

In Section 1.5, we prove a theorem on pointwise convergence of a class of infinite 
products. Our theorem uses the transition probabilities which are associated with the 
random walks that we introduce in Section 1.2 below. 

Chapter 2 develops general theory, transition probabilities, branching processes, 
and corresponding probability spaces. This material has a more technical flavor. 

In Chapter 3 we specialize the measures, constructed in a general framework in 
Chapter 2, to the concrete context of wavelets in the Hilbert space $L^2(\mathbb{R})$. In Chapter
2, we introduce transition probabilities $P_x$ on a probability space $\Omega$; and in the
wavelet application, we outline how the group of integers $\mathbb{Z}$ is naturally embedded in
$\Omega$. And we show that the orthogonality issue for wavelets is equivalent to $\mathbb{Z}$ having
full measure in $\Omega$.

In Chapter 4 we turn to a class of affine fractals, iterated function systems (IFS) 
by affine maps, but with focus on certain Cantor sets, mainly the middle-quarter 
Cantor set. For the affine IFSs we show that a natural basis question is equivalent to 
the natural numbers $\mathbb{N}_0$ having full measure in $\Omega$. For this formulation we outline a
conjugate system of Cantor sets. 

Chapter 5 addresses the general case of measurable systems constructed from a 
given N-to-one mapping. In this general context, we give a sufficient condition for 
$\mathbb{Z}$ having full measure in $\Omega$.

In Chapter 6 we resume the discussion of wavelets, but in a non-commutative 
setting motivated by multiwavelets. 

In Chapters 7 and 8 we turn to two measure-theoretic issues from the general 
framework of Chapter 5: There is a class of representations of two familiar C*- 
algebras which produce IFSs in the measure category, the Cuntz algebra, and the 
algebra of the canonical anticommutation relations. Among other things, we show 
that the determinantal measures studied by Russell Lyons and others in combinatorial 
probability theory are special cases of this. 

Chapter 8 contains some examples of wavelet packets and permutative representations.
We give a systematic presentation of this subject, and we show that the
concrete examples fit within a general measure-theoretic framework. While the ex- 
amples have been studied in research papers, they have so far not found a unified 
formulation. In Chapter 9 we show that the underlying geometric questions for these 
examples may be phrased in the context of operators in Hilbert space. 

Our selection in Chapters 8-9 and in the appendices includes certain geometric 
topics from operator algebras and Hilbert space. They have been carefully selected 
with a view to overlapping with signal and image processing, and keeping in mind 
the kind of scale-similarity relations that are used in the analysis of wavelets, fractals, 
and discrete dynamics. For example, we stress how problems in “fractal noise,” in 
images, and more generally in higher dimension may be unified with the use of 
operators, tensor products of Hilbert space, and states on C*-algebras. 

A case in point is the serendipity involved in the observation (see [Jor03]) to the 
effect that representations of the so-called Cuntz relations and Cuntz algebras from 
C*-algebra theory have been extensively used in the early days of signal processing 
(before they were “discovered” in C*-algebras!), and that they continue to be relevant
to the kind of recursive algorithms still used to this day in wireless communication 
and in image processing. 


1.2 Path space 


A well-tested tool in analysis and mathematical physics centers around the applica- 
tion of path-space measures. This tool is used in attacking a variety of singular con- 
vergence, or approximation, problems. We will adopt this viewpoint in our study of 
wavelet approximations. Traditionally, the setting for wavelet questions has included 
assumptions concerning continuity, or some kind of differentiability. In contrast, we 
shall work almost entirely in the measurable category. One advantage of this ap- 
proach is that we stay in the measurable category when addressing problems from 
multiresolution analysis (MRA). Earlier work on the use of probability in wavelets 
includes that of R.F. Gundy et al. 

Our present viewpoint is more general than [DoGH00], and it starts with the
random walks naturally associated with a measure space $(X, \mathcal{B})$ and a given measurable
onto map $\sigma\colon X \to X$ such that $\#\sigma^{-1}(\{x\}) = N$ for all $x \in X$, where $N$,
$2 \le N < \infty$, is fixed. Iteration of the branches


$$\sigma^{-1}(\{x\}) = \{\, y \in X \mid \sigma(y) = x \,\} \tag{1.2.1}$$

then yields a combinatorial tree. If

$$\omega = (\omega_1, \omega_2, \dots) \in \Omega := \{0, 1, \dots, N-1\}^{\mathbb{N}},$$


an associated path may be thought of as an infinite extension of the finite walks
starting at $x$, where $(\tau_i)$, $i = 0, 1, \dots, N-1$, is a system of inverses, i.e., where

$$\sigma \circ \tau_i = \mathrm{id}_X, \quad 0 \le i < N. \tag{1.2.2}$$


We now turn to the path-space measures $P_x$ alluded to above, but for now only in
a special case. More general cases emerge later on, especially in Chapters 2, 5, and 7.
These will be different applications, and Chapters 7 and 9 will even be in a non-commutative
setting. The definition of $P_x$ which we give below is quite standard, but it
may not look simple or natural to someone unfamiliar with infinite products, or with
the present viewpoint. This should not be alarming. We believe that the fundamentals
of $P_x$ will allow the reader to already now see a number of quite familiar and basic
wavelet constructions in a new light. This will be useful later, both for wavelets
and for a class of similar fractal constructions given in Chapter 4, as well as for
frequency-localized multiwavelets in Chapter 7. Each measure $P_x$ depends on the
initial point $x$ and on a prescribed and fixed weight function $W$ on $X$.

We shall think of $P_x$ as a measure on (infinite) paths originating at $x$, but its
construction rests on a fundamental idea of Kolmogorov which states that $P_x$ is determined
by its value on paths that are only fixed at a finite number $n$ of places. It
is prescribed for each $n$, and it extends to all paths, i.e., to $\Omega$, if and only if there is
consistency from $n$ to $n + 1$ for all $n$. This consistency condition is defined from the
given function $W$, in that $W$ assigns the transition probabilities; see Lemma 2.5.1
(Kolmogorov consistency, p. 46) for precise details. Our probability space $\Omega$ will
be the space of all infinite words $\omega$; $\Omega$ will be given its standard Tychonoff infinite-product
topology, and $C(\Omega)$ will denote the space of all continuous complex-valued
functions on $\Omega$.

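If $W\colon X \to [0,1]$ is a given measurable function such that


$$\sum_{y \colon \sigma(y) = x} W(y) = 1, \tag{1.2.3}$$


then an associated measure $P_x$ on $\Omega$ may be defined as follows. Suppose some function
$f \in C(\Omega)$ depends only on a finite number of coordinates, say $\omega_1, \dots, \omega_n$; then
set

$$\int_\Omega f \, dP_x = \sum_{(\omega_1, \dots, \omega_n)} f(\omega_1, \dots, \omega_n)\, W(\tau_{\omega_1} x)\, W(\tau_{\omega_2}\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x). \tag{1.2.4}$$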

Extensions of this formula to $\Omega$ can be done in a number of ways: see Chapter 2,
Figure 1.1, and the cited references. 
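
To make formula (1.2.4) concrete, here is a minimal sketch under assumptions that are not taken from the text: $X = [0, 1)$, $\sigma(x) = 2x \bmod 1$ (so $N = 2$), the inverse branches $\tau_i(x) = (x + i)/2$, and the illustrative weight $W(x) = \cos^2(\pi x)$, which satisfies (1.2.3) because $\cos^2 + \sin^2 = 1$. The function names are hypothetical. The sketch computes the probability of the length-five word $(0, 1, 1, 0, 1)$ of Figure 1.1 and evaluates the right-hand side of (1.2.4) for functions of finitely many coordinates.

```python
import math
from itertools import product


def tau(i, x):
    """Assumed inverse branches of sigma(x) = 2x mod 1 on [0, 1)."""
    return (x + i) / 2.0


def W(x):
    """Illustrative weight; W(tau(0, x)) + W(tau(1, x)) == 1 as in (1.2.3)."""
    return math.cos(math.pi * x) ** 2


def path_probability(word, x):
    """P_x of the set of infinite paths that begin with the finite word."""
    prob, point = 1.0, x
    for digit in word:
        point = tau(digit, point)   # step from the current point to tau_digit(point)
        prob *= W(point)            # multiply by the transition probability
    return prob


def integrate(f, n, x):
    """Right-hand side of (1.2.4) for f depending on the first n coordinates."""
    return sum(f(word) * path_probability(word, x)
               for word in product((0, 1), repeat=n))


x = 0.3
p_01101 = path_probability((0, 1, 1, 0, 1), x)
total = integrate(lambda word: 1.0, 5, x)   # total mass of P_x
print(p_01101, total)
```

The consistency from $n$ to $n + 1$ can be observed directly in this setting: the probability of a word equals the sum of the probabilities of its two one-letter extensions, which is a consequence of (1.2.3).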

We shall need the concept of random walk in a slightly restricted context. Normally,
the concept of random walk is confined to a lattice, i.e., $\mathbb{Z}^d$ for some $d$. The
qualitative features of such a walk depend both on the lattice dimension $d$ and on

Fig. 1.1. $N = 2$, simple dyadic branch points; a function $W\colon X \to [0,1]$ is given. A
path in the random-walk model. Assignment of probability to a particular length-five path:


$$P_x^{(5)}(\{(0\,1\,1\,0\,1)\}) = W(\tau_0 x)\, W(\tau_1\tau_0 x)\, W(\tau_1^2\tau_0 x)\, W(\tau_0\tau_1^2\tau_0 x)\, W(\tau_1\tau_0\tau_1^2\tau_0 x).$$


suitable transition probabilities. In our present context, we shall use a combinatorial
tree (see Figures 1.1 and 2.1 (the Farey tree, p. 42)) in place of a lattice, and
we shall use a given function $W$ to prescribe transition probabilities. But the various
paths within our tree can originate at points chosen from the set $X$. Here $X$ is
given (typically compact), and $X$ carries a fixed finite-to-one endomorphism $\sigma$. Our
construction is easier to visualize when a finite set of branches of the “inverse” for
$\sigma$ is assigned in concrete terms: hence the maps $\tau_i$. If we assume property (1.2.3)
for the function $W$, and if $x$ and $y$ are points in $X$ such that $\sigma(y) = x$, then the
number $W(y)$ from (1.2.3) represents the probability of a transition from $x$ to $y$.
Figure 1.1 above illustrates how step-by-step conditional probabilities are then used
in assigning probabilities to a finite path which originates at some chosen point in $X$.

What if the path is infinite? We shall see in Chapter 2 (Sections 2.4 and 2.5) 
how a fundamental idea of Kolmogorov then allows us to assign probabilities also to 
infinite paths. 

In Section 1.1 above we already used this idea for the case when $X$ is the circle,
also called the one-torus, $\mathbb{T}$. For each $N$, we may then consider $\sigma(z) := z^N$. And in
the context of wavelet constructions, we already noted how to concretely parametrize
the branches of the inverse of $z^N$ when $z$ is complex and restricted to $\mathbb{T}$.
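
One possible parametrization of these branches, given here as a sketch rather than as the book's own choice, writes $z = e^{i\theta}$ with $\theta \in [0, 2\pi)$ and sets $\tau_k(z) = e^{i(\theta + 2\pi k)/N}$ for $k = 0, \dots, N - 1$. The function names below are hypothetical.

```python
import cmath


def sigma(z, N):
    """The N-to-one map sigma(z) = z**N on the unit circle."""
    return z ** N


def inverse_branches(z, N):
    """The N points tau_k(z) on the circle with sigma(tau_k(z)) == z.

    Assumed parametrization: z = exp(i * theta) with theta in [0, 2*pi),
    and tau_k(z) = exp(i * (theta + 2*pi*k) / N).
    """
    theta = cmath.phase(z) % (2 * cmath.pi)
    return [cmath.exp(1j * (theta + 2 * cmath.pi * k) / N) for k in range(N)]


z = cmath.exp(0.7j)
preimages = inverse_branches(z, 3)
errors = [abs(sigma(w, 3) - z) for w in preimages]
print(preimages, errors)
```

Each `tau_k` is a right inverse of `sigma`, in the sense of (1.2.2), up to floating-point rounding.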

A special feature of this construction, which will be explored in this work, is that 
of attractive convergence properties for infinite products of the form 


$$\prod_{n,\,\omega_1, \dots, \omega_n} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x) \tag{1.2.5}$$


over certain subsets of $\Omega$. As it turns out, these infinite products are determined by
the measures $(P_x)_{x \in X}$, and by the Ruelle transition operator

$$(R_W g)(x) = \sum_{y \colon \sigma(y) = x} W(y)\, g(y), \quad g \in L^1(X). \tag{1.2.6}$$


The operator R in (1.2.6) is called the transition operator, the Ruelle operator, or 
the Ruelle—Perron—Frobenius operator, and it will play a major role in what follows. 
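
A small numerical sketch may help fix the definition of the transition operator. It rests on assumptions made here for illustration only: $X = [0, 1)$, $\sigma(x) = Nx \bmod 1$ with preimages $(x + k)/N$, and the weight $W(x) = \cos^2(\pi x)$ for $N = 2$. The helper name `ruelle` is hypothetical. Under the normalization (1.2.3), the constant function 1 is reproduced by the operator, which the code checks at a few sample points.

```python
import math


def ruelle(W, g, N=2):
    """Return the function R_W g for sigma(x) = N*x mod 1 on [0, 1).

    The preimages of x under sigma are (x + k) / N for k = 0, ..., N - 1.
    """
    def Rg(x):
        return sum(W((x + k) / N) * g((x + k) / N) for k in range(N))
    return Rg


def W(x):
    """Illustrative weight satisfying the normalization (1.2.3) for N = 2."""
    return math.cos(math.pi * x) ** 2


def one(x):
    """The constant function 1."""
    return 1.0


R_one = ruelle(W, one)
samples = [i / 10 for i in range(10)]
deviation = max(abs(R_one(x) - 1.0) for x in samples)
print(deviation)
```

For this choice of $W$ the deviation is at the level of rounding error, so the constant function is a fixed point of the operator at the sampled points.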

Many problems in dynamics are governed by transition probabilities $W$ and $P_x$,
and by an associated transition operator $R_W$ as in (1.2.6). Wavelet theory is a case in
point, and we show that fundamental convergence questions for wavelets and
properties of the solutions to (1.3.1) depend on the positive solutions $h$ to the eigenvalue
problem $R_W(h) = h$. We refer to (1.3.10)–(1.3.11) below. The solutions $h$ are called
harmonic, and the function $h$ in (1.3.10) is a special harmonic function which we
will study in detail later in Chapter 3.
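
As a small illustration of the eigenvalue problem, the sketch below applies the transition operator to functions on $[0,1)$. It assumes $\sigma(x) = Nx \bmod 1$, so that the solutions of $\sigma(y) = x$ are $y = (x+j)/N$, and it uses the sample weight $W(x) = \cos^2(\pi x)$ with $N = 2$; the helper name `ruelle` is hypothetical.

```python
import math


def ruelle(W, g, N):
    """Return the function R_W g, assuming sigma(x) = N x mod 1 on [0, 1)."""
    def Rg(x):
        # the N solutions y of sigma(y) = x are y = (x + j) / N
        return sum(W((x + j) / N) * g((x + j) / N) for j in range(N))
    return Rg


def W_haar(x):
    # sample weight for N = 2 (an assumption made for this sketch)
    return math.cos(math.pi * x) ** 2


def one(x):
    return 1.0


def cos_wave(x):
    return math.cos(2 * math.pi * x)


R_one = ruelle(W_haar, one, 2)        # equals the constant 1: the constant is harmonic
R_cos = ruelle(W_haar, cos_wave, 2)   # equals cos^2(pi x): cos_wave is not harmonic
```

For this weight the constant function satisfies $R_W h = h$, while a generic function such as $\cos(2\pi x)$ does not.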

The nature of these solutions is a key to the link between the analysis and proba- 
bility of path space. Since the early days when use of measures on infinite path space 
was first suggested for problems in mathematical analysis, a key question was: what 
is the support of the measure under discussion? And our use of the $W$-measures $P_x$
for problems in wavelets and fractals is a case in point.

In particular applications of this viewpoint, the measures $P_x$ are initially defined
naturally in connection with certain random-walk models. And each $P_x$, for $x \in X$,
is then a Borel measure on a certain rather large and unwieldy probability space $\Omega$.
But for computations it is essential to know that the measures are in fact supported
on (or carried by) much smaller subsets of the initial space $\Omega$. These smaller subsets
in $\Omega$ can be made quite explicit, and they are closely related to dynamics and general
classes of multiresolutions, as we proceed to describe.


1.3 Multiresolutions 


All this time the Guard was looking at her, first through a telescope, then 
through a microscope, and then through an opera-glass. —Lewis Carroll


The notion of wavelet refers to a specific basis construction in a function space which 
is fixed at the outset and which carries an inner product. This construction is popu- 
lar, perhaps because it is so computationally attractive. It is based on a certain self- 
similarity (or scaling-similarity) which gives rise to a cascading or nested family of 
closed subspaces. These subspaces begin with a fixed resolution which is determined 
by a scaling equation such as (1.3.1) below; but as we will see, there are more general 
ways of identifying scaling-self-similarity. 

The idea is that there is a scaling transformation which lets us move in discrete
steps, up and down an associated scale of subspaces, hence the concept of a
multiresolution, a concept derived from optics; see, e.g., [JaMR01], [Mal89], and the
references given there. One end of the resolution scale refers to “coarse,” and the
other to “fine.” Comparing two subspaces, the space of the coarse scale is contained
in that of the fine resolution. The traditional wavelet multiresolutions are by now
very nicely presented in the literature, and we especially recommend Daubechies’
book [Dau92].

As noted for example in [Mey89] and [Mey97], a fixed scaling transformation 
is also the feature which most efficiently introduces such tools as Hilbert space, op- 
erator theory, martingales, conditional expectations, and computational algorithms 
into wavelet analysis. Since we are in Hilbert space, closed subspaces correspond in 
a one-to-one fashion to (orthogonal) projections, and the notion of multiresolution 
may thus be rewritten, as we will see (Chapter 7), in the language of projections. Thus
the multiresolution structure, understood this way, offers a telescope for looking at
functions or at signals in digitized form, and it makes a crucial link to signal and
image processing. (See the appendices for further details!) We will meet a number of
incarnations (some special, and some rather general) of multiresolutions in several
chapters throughout the book, especially in Chapters 7–9; but they will always be
based on the geometry of scaling projections in Hilbert space.

The multiresolution approach to wavelets involves functions on $\mathbb{R}$. It begins with
the fixed-point problem

$$\varphi(t) = N \sum_{k\in\mathbb{Z}} a_k\,\varphi(Nt - k), \qquad t \in \mathbb{R}, \tag{1.3.1}$$

where a given sequence $(a_k)_{k\in\mathbb{Z}}$ is chosen with special filtering properties, e.g.,
quadrature-mirror filters; see [BrJo02b, Jor01a, DuJo06b]. The equation (1.3.1) is
called the scaling identity, and the $a_k$’s the response coefficients, or the masking
coefficients. Introducing the Fourier series

$$m(x) = \sum_{k\in\mathbb{Z}} a_k e^{-2\pi i k x} \tag{1.3.2}$$

and Fourier transform

$$\hat\varphi(x) = \int_{\mathbb{R}} e^{-2\pi i x t}\,\varphi(t)\,dt, \tag{1.3.3}$$

we get the relation

$$\hat\varphi(x) = m\left(\frac{x}{N}\right)\hat\varphi\left(\frac{x}{N}\right), \qquad x \in \mathbb{R}, \tag{1.3.4}$$

which suggests a closer inspection of the infinite products

$$\prod_{k=1}^{\infty} m\left(\frac{x}{N^k}\right). \tag{1.3.5}$$

Since we want solutions $\varphi$ to (1.3.1) which are in $L^2(\mathbb{R})$, (1.3.4)–(1.3.5) suggest the
corresponding convergence questions for the function $W := |m|^2$.
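
The convergence question can be tried out on the simplest case. The sketch below assumes the Haar coefficients $a_0 = a_1 = 1/2$ with $N = 2$ and the Fourier convention $e^{-2\pi i x t}$; it compares partial products of (1.3.5) with the Fourier transform of the indicator function of $[0,1]$. All function names are hypothetical.

```python
import cmath
import math


def m_haar(x):
    # m(x) = sum_k a_k e^{-2 pi i k x} with a_0 = a_1 = 1/2 (Haar, assumed here)
    return (1 + cmath.exp(-2j * math.pi * x)) / 2


def partial_product(m, x, N, terms):
    """Finite product m(x/N) m(x/N^2) ... m(x/N^terms)."""
    prod = 1.0 + 0j
    for k in range(1, terms + 1):
        prod *= m(x / N ** k)
    return prod


def haar_phi_hat(x):
    # Fourier transform of the indicator function of [0, 1]
    if x == 0:
        return 1.0 + 0j
    return (1 - cmath.exp(-2j * math.pi * x)) / (2j * math.pi * x)
```

In this example the partial products approach `haar_phi_hat(x)` at a geometric rate, since the partial product with $n$ factors equals $\hat\varphi(x)/\hat\varphi(x/2^n)$.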
When $(P_x)_{x\in[0,1]}$ is the family of measures on $\Omega$ corresponding to $W = |m|^2$,
then the formal infinite product

$$|\hat\varphi(x)|^2 = \prod_{k=1}^{\infty} W\left(\frac{x}{N^k}\right)$$

is $P_x(\{(0, 0, 0, \dots)\})$, i.e., the measure of the singleton $(0, 0, \dots)$ (an infinite string
of zeroes) in $\{0,\dots,N-1\}^{\mathbb{N}}$. There is a natural way (based on Euclid’s algorithm)
of embedding $\mathbb{Z}$ into

$$\Omega = \{0,\dots,N-1\}^{\mathbb{N}} \tag{1.3.6}$$

such that

$$P_x(\mathbb{Z}) = \sum_{k\in\mathbb{Z}} |\hat\varphi(x+k)|^2. \tag{1.3.7}$$


Remark 1.3.1. Note that, in general, it is not at all clear that the measures $(P_x)$,
$x \in X$, on $\Omega$ should even have atoms. Typically, they do not have them! But if
atoms exist, i.e., when there are points $\omega \in \Omega$ such that $P_x(\{\omega\}) > 0$, we note that
this yields convergence of an associated infinite product. Let $\mathbb{N}_0 := \{0,1,2,\dots\} =
\mathbb{N} \cup \{0\}$. Using Euclid, and the $N$-adic expansion

$$k = i_1 + i_2 N + \cdots + i_n N^{n-1} \quad \text{for } k \in \mathbb{N}_0, \tag{1.3.8}$$

we see that the points

$$\omega(k) = (i_1, \dots, i_n, \underbrace{0, 0, 0, \dots}_{\infty\ \text{string of zeroes}})$$

represent a copy of $\mathbb{N}_0$ sitting in $\Omega$. With the identification $k \leftrightarrow \omega(k)$, we set

$$P_x(\mathbb{N}_0) = \sum_{k=0}^{\infty} P_x(\{\omega(k)\}). \tag{1.3.9}$$

But in general, the function $P_x(\mathbb{N}_0)$ might be zero. Our first observation (Proposition
5.3.2) about

$$h(x) := P_x(\mathbb{N}_0), \qquad x \in X, \tag{1.3.10}$$

is that it solves the eigenvalue problem

$$R_W h = h. \tag{1.3.11}$$

We say that $h$ is a minimal harmonic function relative to $R_W$. See Chapter 7 for the
justification of the term “minimal”. Note that $h = 0$ may happen!
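
The identification $k \leftrightarrow \omega(k)$ in (1.3.8) is just the base-$N$ digit expansion with the lowest digit first. A minimal sketch (the function names are hypothetical):

```python
def n_adic_digits(k, N):
    """Digits (i_1, ..., i_n) with k = i_1 + i_2 N + ... + i_n N^(n-1), for k >= 0."""
    if k < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    digits = []
    while k > 0:
        k, r = divmod(k, N)   # peel off the lowest-order digit
        digits.append(r)
    return tuple(digits)      # omega(k) is this tuple followed by an infinite string of zeroes


def from_digits(digits, N):
    """Inverse map: recover k from its finite digit string."""
    return sum(d * N ** s for s, d in enumerate(digits))
```

For example, with $N = 2$ the integer $11 = 1 + 2 + 0\cdot 4 + 8$ has digit string $(1, 1, 0, 1)$, and $k = 0$ corresponds to the empty string, i.e., to the point $(0, 0, 0, \dots)$.
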
Remark 1.3.2. Let $X$, $\mathcal{B}$, $\sigma$, $\tau_0, \dots, \tau_{N-1}$, and $W$ be as described above, and let
$(P_x)_{x\in X}$ be the corresponding transition probabilities. Let

$$\mathbf{0} = (\underbrace{0, 0, 0, \dots}_{\infty\ \text{string of zeroes}}) \in \Omega = \{0,1,\dots,N-1\}^{\mathbb{N}}.$$

While in general, often $P_x(\{\mathbf{0}\}) = 0$, the case when $P_x(\{\mathbf{0}\}) > 0$ is important. The
condition $P_x(\{\mathbf{0}\}) > 0$ is a way of making precise sense of the infinite product

$$P_x(\{\mathbf{0}\}) = \prod_{n=1}^{\infty} W(\tau_0^n x). \tag{1.3.12}$$

If, for example, $\lim_{n\to\infty} W(\tau_0^n x) < 1$, then it is immediate from (1.3.12) that
$P_x(\{\mathbf{0}\}) = 0$.
Suppose $P_x(\{\mathbf{0}\}) > 0$. Then it follows that

$$P_x(\{(i_1, \dots, i_n, \mathbf{0})\}) = W(\tau_{i_1}x)\cdots W(\tau_{i_n}\cdots\tau_{i_1}x)\,P_{\tau_{i_n}\cdots\tau_{i_1}x}(\{\mathbf{0}\}). \tag{1.3.13}$$

Using (1.3.8), we shall identify $k \in \mathbb{N}_0$ with the point $\omega(k) \in \Omega$, and write $P_x(\{k\})$
for the expression in (1.3.13).


Examples 1.3.3. Here are four examples of the equation (1.3.1) which may be
understood with the use of transition probabilities for a certain random-walk model:

(a) $\varphi(t) = \varphi(2t) + \varphi(2t - 1)$, Haar’s wavelet, Figure 1.2(a);
(b) $\varphi(t) = \varphi(2t) + \varphi(2t - 3)$, the stretched Haar wavelet, Figure 1.2(b);
(c) $\varphi(t) = \varphi(3t) + \varphi(3t - 2)$, Cantor’s example;

and

(d) $\varphi(t) = \frac{1+\sqrt{3}}{4}\varphi(2t) + \frac{3+\sqrt{3}}{4}\varphi(2t - 1) + \frac{3-\sqrt{3}}{4}\varphi(2t - 2) + \frac{1-\sqrt{3}}{4}\varphi(2t - 3)$,
Daubechies’ scaling function.


We shall be interested in solutions $\varphi$, called scaling functions, to (a)–(d) which
satisfy the further normalization

$$\int_{\mathbb{R}} \varphi(t)\,dt = 1. \tag{1.3.14}$$

A direct verification shows that (a) and (d) have normalized solutions $\varphi \in L^2(\mathbb{R})$.

We will meet several notions of “normalization,” starting with (1.3.14). In
addition to (1.3.14), we will consider $L^2$-normalization, defined by condition (1.3.24)
below. $L^2$-normalization means “unit $L^2$-norm.” The stretched function $\varphi$ from
example (b), henceforth denoted $\varphi_b$, i.e., the function in Figure 1.2(b), is $L^1$-normalized,
but not $L^2$-normalized. Specifically, one checks that $\varphi_b$ satisfies (1.3.14) but not
(1.3.24).
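
These claims about examples (a) and (b) can be checked by exact arithmetic. The sketch below assumes the half-open versions $\chi_{[0,1)}$ and $\frac{1}{3}\chi_{[0,3)}$ of the two step functions (a choice made only to avoid endpoint ambiguities) and evaluates everything on a grid of step $1/16$; the helper names are hypothetical.

```python
from fractions import Fraction as Fr


def phi_a(t):
    # Haar scaling function, taken as the indicator of [0, 1)
    return Fr(1) if 0 <= t < 1 else Fr(0)


def phi_b(t):
    # stretched Haar scaling function, (1/3) times the indicator of [0, 3)
    return Fr(1, 3) if 0 <= t < 3 else Fr(0)


def satisfies_scaling(phi, shift, grid):
    """Check phi(t) = phi(2t) + phi(2t - shift) at every grid point."""
    return all(phi(t) == phi(2 * t) + phi(2 * t - shift) for t in grid)


STEP = Fr(1, 16)
GRID = [n * STEP for n in range(-32, 80)]   # covers [-2, 5)


def integral(f):
    # exact for step functions whose jumps lie on the grid
    return sum(f(t) for t in GRID) * STEP


l1_a, l2_a = integral(phi_a), integral(lambda t: phi_a(t) ** 2)
l1_b, l2_b = integral(phi_b), integral(lambda t: phi_b(t) ** 2)
```

Both functions have integral 1, but the squared $L^2$-norm of the stretched function is $1/3$ rather than 1.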
[Figure 1.2, two panels: (a) The Haar wavelet system; (b) The stretched Haar wavelet system.]

Fig. 1.2. Haar wavelets: The two functions $\varphi$ and $\psi$ are the father, resp., the mother functions
for Haar’s construction in case (a) on the left, and the stretched Haar (b), on the right. In each
case, we use $\psi$ to create a double-indexed basis system (1.3.16). For (a) this system will be
an orthonormal basis (ONB) in $L^2(\mathbb{R})$; but for (b), the functions (1.3.16) will only represent
a Parseval frame, i.e., a function system which yields the Parseval identity (1.3.22) for every
function in $L^2(\mathbb{R})$. As is apparent from (b), this Parseval system will not be an ONB for the
stretched version. We resume the discussion of this in the text following Figure 6.1 (p. 103).


So the stretching, i.e., passing from (a) to (b) in Figure 1.2, leaves (1.3.14)
stable, while the other two conditions (1.3.23) and (1.3.24) get lost. Specifically,
the stretched scaling function $\varphi_b$ also does not satisfy the orthogonality relations
(1.3.23).

In contrast, the two distinct scaling functions

$$\varphi_a = \chi_{[0,1]} \quad \text{(Haar)}$$

and

$$\varphi_d \quad \text{(Daubechies)}$$

both satisfy all three conditions (1.3.14), (1.3.23), and (1.3.24).
For (a), we may take

$$\varphi = \varphi_a = \chi_{[0,1]},$$

and for (b),

$$\varphi = \varphi_b = \tfrac{1}{3}\chi_{[0,3]},$$

where $\chi$ denotes the indicator function of the respective intervals. Readers familiar
with wavelets will recognize (d) as the scaling identity for the Daubechies wavelet:
it has a normalized solution $\varphi = \varphi_d = \varphi_{\mathrm{Dau}}$ in $L^2(\mathbb{R})$ which is supported in the
compact interval $[0,3]$, see Figure 7.16 (p. 134), and differentiable, see [Dau92].
The solution $\varphi$ in (a) is the scaling function for the Haar wavelet; see Figure 1.2(a).

Equation (c) does not have any normalized solution $\varphi$ in $L^2(\mathbb{R})$, but still $\varphi = \chi_{X_3}$
solves (c) when $X_3$ is the “middle-third Cantor set”: see Chapter 4 and Figure 4.2
(p. 74). Moreover, there is a unique probability measure $\mu = \mu_3$ supported on $X_3$
such that

$$\int f\,d\mu = \frac{1}{2}\left(\int f\left(\frac{x}{3}\right)d\mu(x) + \int f\left(\frac{x+2}{3}\right)d\mu(x)\right) \tag{1.3.15}$$

for all bounded continuous functions $f$ on $\mathbb{R}$. The Cantor set $X_3$, with associated
measure $\mu$, is an example of an iterated function system (IFS) of affine type, and the
harmonic analysis of these systems is the subject of Chapter 4.
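
The invariance property (1.3.15) can be tested on a finite approximation. The sketch below replaces $\mu$ by the uniform distribution on the $2^n$ points obtained from $0$ by applying the two maps $x \mapsto x/3$ and $x \mapsto (x+2)/3$ $n$ times; this discretization, and the names used, are assumptions made for illustration and are not the construction from Chapter 4.

```python
import math


def cantor_points(level):
    """Points sum_i d_i 3^(-i), d_i in {0, 2}, i = 1..level (left endpoints of the level-th stage)."""
    pts = [0.0]
    for _ in range(level):
        pts = [p / 3 for p in pts] + [(p + 2) / 3 for p in pts]
    return pts


def integrate(f, level=12):
    """Approximate the integral of f against the Cantor measure by an average over cantor_points."""
    pts = cantor_points(level)
    return sum(f(p) for p in pts) / len(pts)


def invariance_gap(f, level=12):
    """Difference between the two sides of (1.3.15) in the discretized model."""
    lhs = integrate(f, level)
    rhs = 0.5 * (integrate(lambda x: f(x / 3), level)
                 + integrate(lambda x: f((x + 2) / 3), level))
    return abs(lhs - rhs)
```

In this model the gap is of order $3^{-n}$ for Lipschitz $f$, and the first two moments come out close to $1/2$ and $3/8$, the values one gets by solving (1.3.15) for $f(x) = x$ and $f(x) = x^2$.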

Cantor’s scaling identity, example (c), admits a normalized solution $\varphi$ as follows.
The Hilbert space will not be defined from Lebesgue measure, but rather from
Hausdorff measure of fractal dimension $s$, in this case $s = \log_3(2)$. As our Hilbert space
$\mathcal{H}_s$, we use in [DuJo06b] a separable $L^2$-space defined from the Hausdorff measure
of fractal dimension $\log_3(2)$, and consisting of all $L^2$-functions on a certain set
extending $X_3$, and built from $X_3$, using scaling in the large, and certain gap-filling
steps; see [DuJo06b]. With this construction, the scaling function $\varphi$ will be the
indicator function of $X_3$, having unit norm in the Hilbert space $\mathcal{H}_s$.

The only reasonable solution to (c) is the indicator function of $X_3$, $\varphi = \chi_{X_3}$.
This function is not normalized when referring to Lebesgue measure. The Lebesgue
measure of $X_3$ is actually zero, while the Hausdorff measure $h_s$, $s = \log_3(2)$, of $X_3$
gives us the right normalization, $h_s(X_3) = 1$. Or stated differently, the analogues
of the classical results are true if we modify the measure and the Hilbert space,
using instead the Hilbert space $\mathcal{H}_s$ in place of $L^2(\mathbb{R})$. (Recall $L^2(\mathbb{R})$ is defined from
Lebesgue measure.)

Rather, the new Hilbert space $\mathcal{H}_s$ is defined as an $L^2$ space with respect to
Hausdorff measure $h_s$, $s = \log_3(2)$. For (c), we have the following true version of (1.3.14),
which we can call (1.3.14)$_s$:

$$\int \varphi(t)\,dh_s(t) = 1, \qquad \varphi = \chi_{X_3} \ \text{[indicator function]}. \tag{1.3.14$_s$}$$

As stated, when the integral in (1.3.14) is computed on the solution to (c), referring
to Lebesgue measure, it is $= 0$.

The notion of “normalization” which is independent of the choice of Hilbert
space involves the function $h$ in (1.3.10). Of the four examples, (a)–(d), the three (a),
(c), and (d) have $h = 1$, the constant function, while (b) has $h$ non-constant, and $h$
for this stretched wavelet is given by formula (6.2.17), $p = 3$, and sketched in Figure
6.2 (p. 107).

We postpone a full discussion of (c) to Chapter 4. This fractal case, i.e., the 
middle-third Cantor example, and a related much wider class of affine fractals, will 
be treated in detail and in a wider context in Sections 4.1 and 4.2; see also the paper 
[DuJo06b]. 

So in the list of four, (a)–(d) above, the only one that stands out as being different
is the stretched Haar wavelet (b). Of course, all the fractal wavelets admit stretched
versions. But we do not discuss those in the present book. However, it is true more
generally that all the Cantor sets that arise as affine iterated function systems (IFSs)
admit orthonormal wavelet bases; see [DuJo06b].

The fractal wavelets we treat in Chapters 4 and 6 are the ones that are orthonormal 
bases, and in particular, they are normalized. 

As a result, the function $h$ from (1.3.10), called the minimal eigenfunction, is the
constant 1 in the examples from cases (a), (c), and (d), but not (b).

In the “stretched” example, case (b), we calculate the minimal eigenfunction $h$
in (6.2.17) by a closed expression, the case $p = 3$; see also Figure 6.2 (p. 107).

An important question for dyadic wavelets in $L^2(\mathbb{R})$ is the issue of when these
wavelets form orthonormal bases (ONBs). A dyadic wavelet function $\psi \in L^2(\mathbb{R})$
generates an ONB if the double-indexed family

$$\left\{2^{n/2}\psi(2^n t - k) \mid n, k \in \mathbb{Z}\right\} \tag{1.3.16}$$

satisfies (i) and (ii) below:

(i) $\int_{\mathbb{R}} \overline{\psi_{n,k}(t)}\,\psi_{n',k'}(t)\,dt = \delta_{n,n'}\,\delta_{k,k'}$ with

$$\psi_{n,k}(t) = 2^{n/2}\psi(2^n t - k), \tag{1.3.17}$$

and

(ii) the closed linear span of $\{\psi_{n,k} \mid n, k \in \mathbb{Z}\}$ is $L^2(\mathbb{R})$.

In our analysis of the scaling identity

$$\varphi(t) = 2\sum_{k\in\mathbb{Z}} a_k\,\varphi(2t - k) \tag{1.3.18}$$

(a special case of (1.3.1)), we will be looking at two functions $\varphi$ and $\psi$; the second
one may be taken to be

$$\psi(t) = 2\sum_{k\in\mathbb{Z}} (-1)^k\,\bar{a}_{1-k}\,\varphi(2t - k). \tag{1.3.19}$$


This analysis is the approach to wavelets which goes under the name of
multiresolution analysis. The function $\psi$ which is used in (1.3.16)–(1.3.17) is the solution
to (1.3.19). The two standing conditions which are placed on the numbers $(a_k)_{k\in\mathbb{Z}}$,
called masking coefficients, are

$$\sum_{k\in\mathbb{Z}} \bar{a}_k\,a_{k+2n} = \frac{1}{2}\,\delta_{0,n}, \qquad n \in \mathbb{Z}, \tag{1.3.20}$$

and

$$\sum_{k\in\mathbb{Z}} a_k = 1. \tag{1.3.21}$$

These conditions in themselves do not imply orthonormality in (1.3.17), but only the
following much weaker property:

$$\sum_{n,k\in\mathbb{Z}} \left|\langle \psi_{n,k} \mid f \rangle\right|^2 = \|f\|^2_{L^2(\mathbb{R})} = \int_{\mathbb{R}} |f(t)|^2\,dt \quad \text{for all } f \in L^2(\mathbb{R}). \tag{1.3.22}$$

A system of functions $(\psi_{n,k})$ satisfying (1.3.22) is called a Parseval frame, or a
normalized tight frame.
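
The two standing conditions are easy to test for the dyadic examples. The sketch below reads off $a_k$ as half the coefficients listed in Examples 1.3.3 (a), (b), (d), which is an assumption based on (1.3.18), and it treats the coefficients as real. The function name is hypothetical.

```python
import math


def check_conditions(a, max_shift=4, tol=1e-12):
    """a maps k to a_k (real). Returns (condition (1.3.20) holds, condition (1.3.21) holds)."""
    def corr(n):
        # sum_k a_k a_{k+2n}; missing indices count as zero
        return sum(v * a.get(k + 2 * n, 0.0) for k, v in a.items())

    quadrature = all(abs(corr(n) - (0.5 if n == 0 else 0.0)) < tol
                     for n in range(-max_shift, max_shift + 1))
    unit_sum = abs(sum(a.values()) - 1.0) < tol
    return quadrature, unit_sum


s3 = math.sqrt(3)
HAAR = {0: 0.5, 1: 0.5}
STRETCHED_HAAR = {0: 0.5, 3: 0.5}
DAUBECHIES = {0: (1 + s3) / 8, 1: (3 + s3) / 8, 2: (3 - s3) / 8, 3: (1 - s3) / 8}
```

All three coefficient sets pass both tests, including the stretched Haar example, which illustrates that (1.3.20) and (1.3.21) alone cannot distinguish an ONB from a Parseval frame.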

It turns out that the ONB property for the wavelet is equivalent to either one of
the following two conditions for the normalized scaling function $\varphi$:

$$\int_{\mathbb{R}} \overline{\varphi(t)}\,\varphi(t - k)\,dt = \delta_{0,k}, \qquad k \in \mathbb{Z}, \tag{1.3.23}$$

or

$$\|\varphi\|_{L^2(\mathbb{R})} = 1. \tag{1.3.24}$$

Using this, the reader may check that the ONB property is satisfied for the two
examples (a) and (d), but not for (b). However, the Parseval-frame property is
evidently satisfied for example (b), i.e., for the stretched Haar wavelet. While in (b),
$\varphi = \frac{1}{3}\chi_{[0,3)}$, the corresponding wavelet function $\psi$ is

$$\psi(t) = \begin{cases} \dfrac{1}{3}, & 0 \le t < \dfrac{3}{2}, \\[4pt] -\dfrac{1}{3}, & \dfrac{3}{2} \le t < 3, \\[4pt] 0 & \text{otherwise;} \end{cases}$$

see Figure 1.2(b).
The relevance of random-walk probabilities for understanding of the distinction 
between Parseval frames and ONBs is treated systematically in Chapters 2 and 6. 
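
For step functions of the form $c\,\chi_{[0,L)}$ the integrals in (1.3.23) can be written down exactly, which gives a quick way to see the difference between examples (a) and (b). This is a minimal sketch with hypothetical names, restricted to integer shifts and to scaling functions of this special shape.

```python
from fractions import Fraction as Fr


def overlap(height, length, k):
    """Exact integral of phi(t) phi(t - k) for phi = height * indicator of [0, length)."""
    common = max(Fr(0), length - abs(k))   # length of the intersection of [0, L) and [k, k + L)
    return height ** 2 * common


def is_orthonormal(height, length, shifts=range(-5, 6)):
    """Test condition (1.3.23) on a finite range of integer shifts."""
    return all(overlap(height, length, k) == (1 if k == 0 else 0) for k in shifts)


haar_ok = is_orthonormal(Fr(1), Fr(1))          # phi_a = chi_[0,1)
stretched_ok = is_orthonormal(Fr(1, 3), Fr(3))  # phi_b = (1/3) chi_[0,3)
```

For $\varphi_b$ the shift $k = 0$ gives $1/3$ and the shifts $k = \pm 1$ give $2/9$, so both (1.3.24) and (1.3.23) fail, in agreement with the text.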
Summary of the conclusions from Examples 1.3.3.

• Condition (1.3.14): yes for (a), (b), and (d). No for (c)!

• Condition (1.3.14)$_s$, $s = \log_3(2)$: yes for (c). The integral with respect to the
Hausdorff measure $h_s$ is $\infty$ for $\varphi_a$, $\varphi_b$, and $\varphi_d$.

• Condition (1.3.24): yes for (a), (d). No for (b) and (c)!

• Condition (1.3.24)$_s$, $s = \log_3(2)$, i.e., referring to the Hilbert space $\mathcal{H}_s$. Now yes
for (c)! For $s = \log_3(2)$, the Hilbert $\mathcal{H}_s$-norm of $\varphi_a$, $\varphi_b$, and $\varphi_d$ is $\infty$.

• Condition (1.3.23), orthogonality: Yes for (a), and (d); but no for (b).

• Condition (1.3.23)$_s$, $s = \log_3(2)$. Yes for (c), and no for the other three!


Remark 1.3.4. In an earlier textbook [BrJo02b], we consistently used the notational
convention $m_0(0) = \sqrt{N}$ for the so-called frequency response function $m_0$; but in
engineering, the alternative convention, $m_0(0) = 1$, is less confusing. This is because
the convention $m_0(0) = 1$ captures the probabilistic meaning of “low-pass” better
than the one from [BrJo02b]; see the “References and remarks” section at the end
of Chapter 9 for a discussion of this point. (Here we have $W = |m_0|^2$.) Also notice
that our current convention, $m_0(0) = 1$, is consistent with the pointwise convergence
of the infinite products (1.3.5) and (1.3.12). Various versions of, and approaches to,
these and related infinite products will play a central role in the next two sections,
and in fact throughout the book.


1.4 Sampling 


In this section and the next, we study an intriguing relationship between the following 
three problems: 


(1) When does the scaling identity (1.3.1) have $L^2$ solutions?

(2) How may the transition probabilities $P_x$ be used in sampling certain functions at
the points $x + k$ as $k$ runs over a set of integers?

(3) When is the infinite product (1.3.5) pointwise convergent?


In the wavelet applications, $X = [0,1]$, and the system $\sigma, \tau_0, \dots, \tau_{N-1}$ is as
follows:

$$\sigma(x) = Nx \bmod 1, \qquad \tau_j(x) = \frac{x + j}{N}, \quad j = 0, 1, \dots, N-1. \tag{1.4.1}$$

See also Figure 4.1 (p. 73) in Chapter 4.

Lemma 1.4.1. Setting $F(x) := P_x(\{\mathbf{0}\})$ and

$$k = i_1 + i_2 N + \cdots + i_n N^{n-1} \quad (\in \mathbb{N}_0), \tag{1.4.2}$$

we get the formula

$$P_x(\{k\}) = F(x + k), \tag{1.4.3}$$

where we have identified $k$ with the point $\omega(k) := (i_1, \dots, i_n, \mathbf{0})$ in $\Omega$.


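Proof. To see this, identify functions on $[0, 1]$ with 1-periodic functions on $\mathbb{R}$, and
note that the second formula in (1.4.1) yields $\tau_{i_n}\cdots\tau_{i_1}(x) = (x + k)/N^n$ where $k$
is given by (1.4.2). Hence, if $1 \le s \le n$, then

$$W(\tau_{i_s}\cdots\tau_{i_1}x) = W\left(\frac{x + i_1 + i_2 N + \cdots + i_s N^{s-1}}{N^s}\right) = W\left(\frac{x + k}{N^s}\right).$$

It follows that (1.4.3) is really just a rewrite of (1.3.13). The right-hand side of
(1.3.13) yields

$$W\left(\frac{x + k}{N}\right)\cdots W\left(\frac{x + k}{N^n}\right)F\left(\frac{x + k}{N^n}\right) = F(x + k), \tag{1.4.4}$$

which is the desired conclusion. □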
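
The lemma can be checked numerically in a concrete case. The sketch below assumes $N = 2$ and the sample weight $W(x) = \cos^2(\pi x)$, truncates the infinite product defining $F$ after 60 factors, and evaluates the right-hand side of (1.3.13) along the digits of $k$. The function names are hypothetical.

```python
import math

N = 2


def W(x):
    # sample 1-periodic weight (an assumption of this sketch)
    return math.cos(math.pi * x) ** 2


def F(x, terms=60):
    """Truncation of F(x) = P_x({0}) = prod_{n >= 1} W(x / N^n)."""
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= W(x / N ** n)
    return prod


def digits(k):
    """N-adic digits i_1, ..., i_n of k >= 0, lowest digit first."""
    out = []
    while k > 0:
        k, r = divmod(k, N)
        out.append(r)
    return out


def P_singleton(x, k, terms=60):
    """Right-hand side of (1.3.13): W along the digits of k, then the tail factor F."""
    prob, y = 1.0, x
    for i in digits(k):
        y = (y + i) / N      # y = tau_i(y)
        prob *= W(y)
    return prob * F(y, terms)
```

Up to rounding, `P_singleton(x, k)` and `F(x + k)` agree, so the atoms of $P_x$ on $\mathbb{N}_0$ are samples of the single function $F$ at the points $x + k$.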


Remark 1.4.2. To extend $P_x(\cdot)$ from $\mathbb{N}_0$ to $\mathbb{Z}$, recall that $k \in \mathbb{N}_0$ is identified with
the singleton $\omega(k) = (i_1, \dots, i_n, \mathbf{0})$ via (1.4.2). Now, if $-N^n \le k < 0$, then set

$$P_x(\{k\}) := P_x\left(\left\{\left(\omega(k + N^n), \underbrace{N-1, N-1, \dots}_{\infty\ \text{string}}\right)\right\}\right). \tag{1.4.5}$$

To help the reader gain some intuitive feeling for the conclusion in Lemma 1.4.1,
observe that the right-hand side of (1.4.3) represents a sampling of the function $F$
at the integral translates on $\mathbb{R}$, starting at $x$, i.e., $x + k$. Obviously, different subsets
of $\Omega$ would yield different sets of sampling points for $F$, including non-uniformly
distributed sampling points; see [AlGr01].

Starting with Shannon [Sha49], the theory of sampling has emerged as a signif- 
icant tool in signal processing; see, e.g., the beautifully written survey [AlGr01] as 
well as the references cited therein. Thus, in a general context, our formula (1.4.3) 
offers a probabilistic prescription for the sampling of functions on the real line, and 
at the same time it stresses the “random” feature of sampling. 


1.5 A convergence theorem for infinite products 


We now show how this viewpoint from sampling theory is closely related to some
fundamental properties of the measures $P_x$. In particular, our Theorem 1.5.2 gives a
necessary and sufficient condition for pointwise convergence of the infinite product
(1.3.5), or more generally (1.2.5), with the condition for convergence stated in terms
of the harmonic function $h$ of (1.3.10). The relationship between $h$ and the measure
family $P_x$ is studied more systematically in Sections 2.7 and 3.3 below.
Let $A \subset \mathbb{Z}$. Returning to (1.4.3), we set

$$P_x(A) := \sum_{k\in A} P_x(\{k\}) = \sum_{k\in A} F(x + k). \tag{1.5.1}$$

As in Lemma 1.4.1, the number $N$, $N \ge 2$, and the function $W$ are given. The
measures $P_x$ are constructed from these data using (1.2.4), and we have the two
functions

$$F(x) := P_x(\{(\underbrace{0, 0, 0, \dots}_{\infty\ \text{string of zeroes}})\}) \tag{1.5.2}$$

and

$$h(x) := P_x(\mathbb{Z}), \qquad x \in \mathbb{R}. \tag{1.5.3}$$

We shall meet this function $h$ at several places later in the book. It is a
distinguished non-negative eigenfunction (corresponding to eigenvalue 1) for the transfer
operator $R_W$ associated with the weight function $W$, and our next result, Theorem
1.5.2, shows that a property of $h$ at points $x$ determines the convergence of a crucial
infinite product built from $W$ at the same point $x$. The infinite products are studied
systematically in Chapter 5, while Section 2.7 describes a wider class of
eigenfunctions for $R_W$. We show in Theorem 2.7.1 that the 1-eigenspace $E_1(R_W)$ for $R_W$
admits a boundary representation which mimics a classical boundary representation
from harmonic analysis, and we outline the role $h$ plays: It is characterized by a
certain minimality property among functions in $E_1(R_W)$. This in turn is motivated
by a certain Perron–Frobenius context for the operator $R_W$ which is studied more
generally in Chapter 6.

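Finally, for $k \in \mathbb{N}$, set

$$N^k\mathbb{Z} = \{N^k j \mid j \in \mathbb{Z}\}. \tag{1.5.4}$$

Using (1.4.5), we see that $N^k\mathbb{Z}$ is represented in $\Omega = \{0,1,\dots,N-1\}^{\mathbb{N}}$ as

$$(\underbrace{0, \dots, 0}_{k\ \text{zeroes}},\ \underbrace{\omega_1, \omega_2, \dots, \omega_n}_{\text{a finite string of symbols } \omega_j \in \{0,\dots,N-1\}},\ \underbrace{0, 0, 0, \dots}_{\infty\ \text{string of zeroes}}). \tag{1.5.5}$$

An infinite string of zeroes will be denoted $\mathbf{0}$.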


Lemma 1.5.1. Let $N$, $W$, $F$, $P_x$, and $h$ be as described above; see (1.5.2)–(1.5.3).
Then $h$ satisfies the following cocycle identity:

$$h(x)\,W(x) = P_{Nx}(N\mathbb{Z}), \qquad x \in \mathbb{R}. \tag{1.5.6}$$

In particular, if $h(x) = 1$ a.e. $x \in \mathbb{R}$, then we recover the function $W$ from the
transition probabilities $P_x$, a.e. $x \in \mathbb{R}$.
Proof. We calculate the left-hand side in (1.5.6), using the earlier equations:

$$\begin{aligned}
h(x)\,W(x) &= W(x)\sum_{j\in\mathbb{Z}} F(x + j) && \text{by (1.4.3) and (1.5.3)} \\
&= \sum_{j\in\mathbb{Z}} W(x + j)\,F(x + j) && \text{by periodicity} \\
&= \sum_{j\in\mathbb{Z}} F(N(x + j)) && \text{by (1.4.4)} \\
&= \sum_{j\in\mathbb{Z}} F(Nx + Nj) \\
&= \sum_{j\in\mathbb{Z}} P_{Nx}(\{Nj\}) && \text{by (1.4.3)} \\
&= P_{Nx}(N\mathbb{Z}) && \text{by (1.5.1)}.
\end{aligned}$$

This is the desired identity (1.5.6), and the proof is completed. □
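
The cocycle identity is more interesting when $h$ is not constant. The sketch below uses the stretched Haar data of Examples 1.3.3(b), assuming $W(x) = \cos^2(3\pi x)$ and $F(x) = (\sin(3\pi x)/(3\pi x))^2$ (the squared modulus of the Fourier transform of $\frac{1}{3}\chi_{[0,3]}$), and truncates the sums over $\mathbb{Z}$ at $|j| \le K$. The names and the truncation level are choices made for this illustration.

```python
import math

N = 2


def W(x):
    # stretched Haar: m(x) = (1 + e^{-2 pi i 3 x}) / 2, hence W = |m|^2 = cos^2(3 pi x)
    return math.cos(3 * math.pi * x) ** 2


def F(x):
    # squared modulus of the Fourier transform of (1/3) * indicator of [0, 3]
    t = 3 * math.pi * x
    return 1.0 if t == 0 else (math.sin(t) / t) ** 2


def P(x, scale=1, K=20000):
    """Truncation of P_x(scale * Z) = sum over j of F(x + scale * j), |j| <= K."""
    return sum(F(x + scale * j) for j in range(-K, K + 1))


def h(x, K=20000):
    """Truncation of h(x) = P_x(Z)."""
    return P(x, 1, K)
```

With these data, `h(x) * W(x)` and `P(N * x, scale=N)` agree up to the truncation error, while `h` itself varies between 1 (at $x = 0$) and 0 (at $x = 1/3$).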


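Theorem 1.5.2. Let $N$, $W$, $F$, $P_x$, and $h$ be as described above. Let $x \in \mathbb{R}$, and
suppose that $P_x(\{\mathbf{0}\}) > 0$. Then the following two conditions are equivalent.


(a) The limit on the right-hand side below exists, and

$$P_x(\{\mathbf{0}\}) = \lim_{k\to\infty}\prod_{j=1}^{k} W\left(\frac{x}{N^j}\right).$$

(b) The limit on the left-hand side below exists, and

$$\lim_{k\to\infty} h\left(\frac{x}{N^k}\right) = 1.$$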


Proof. (a) $\Rightarrow$ (b). An iteration of the identity (1.5.6) in Lemma 1.5.1 above yields

$$P_x(N^k\mathbb{Z}) = \left(\prod_{j=1}^{k} W\left(\frac{x}{N^j}\right)\right) h\left(\frac{x}{N^k}\right). \tag{1.5.7}$$

Using (1.5.5), and working in $\Omega$, we find that

$$\bigcap_{k\in\mathbb{N}} N^k\mathbb{Z} = \{\mathbf{0}\}. \tag{1.5.8}$$

An application of a standard result in measure theory [Rud87, Theorem 1.19(e),
p. 16] now yields existence of the following limit:

$$\lim_{k\to\infty} P_x(N^k\mathbb{Z}) = P_x(\{\mathbf{0}\}) = F(x). \tag{1.5.9}$$

Since (a) is assumed, and $F(x) > 0$, we conclude that the limit of $h(x/N^k)$, for $k \to
\infty$, must exist as well, and further that

$$F(x) = F(x)\lim_{k\to\infty} h\left(\frac{x}{N^k}\right).$$

Using again $F(x) > 0$, we finally conclude that (b) holds.

(b) $\Rightarrow$ (a). Recall that formula (1.5.7) holds in general. If (b) is assumed, we then
conclude that the limit of $\prod_{j=1}^{k} W(x/N^j)$ exists as $k \to \infty$. The limit on the left-hand
side in (1.5.7) exists and is $F(x)$. The conclusion (a) follows as a result:

$$F(x) = \lim_{k\to\infty}\prod_{j=1}^{k} W\left(\frac{x}{N^j}\right)\cdot\lim_{k\to\infty} h\left(\frac{x}{N^k}\right) = \lim_{k\to\infty}\prod_{j=1}^{k} W\left(\frac{x}{N^j}\right). \qquad \square$$


1.6 A brief outline 


In Chapter 2, we study random walk as a Markov process, and we compute the
Markov transition measures $P_x$ as $x$ varies over the set $X$. The construction of $P_x$
and the probability space $\Omega$ begins with a sequence of measures $P_x^{(n)}$, $n = 1, 2, \dots$,
corresponding to finite paths of length $n$. Then we show that the infinite-path-space
measure, Lemma 2.4.1, results from an application of the Kolmogorov extension
principle. Our technical analysis involves a certain transition operator $R$ which
generalizes Lawton’s wavelet transition operator; see [Law91a, Law91b].

We already mentioned how the measures $P_x$ serve to prescribe sampling of
functions on the real line, Lemma 1.4.1. However, a deeper understanding of this
sampling viewpoint is facilitated by the introduction of the Perron–Frobenius–Ruelle
operator $R$, Definitions 2.3.1, and an associated family of harmonic functions. In fact,
in Theorem 2.7.1, we make precise the concepts of boundary and of boundary
values for these harmonic functions. Our proof of Theorem 2.7.1 uses the convergence
theorem for martingales, and we outline the martingales associated with our general
transition operator $R$.

In Chapter 3, a fundamental tightness condition for the random-walk model is
introduced. The transition probabilities $P_x$ live in a universal probability space $\Omega$,
but the essential convergence questions for the infinite products (see Chapter 2)
depend on the $P_x$’s being supported on a certain copy of $\mathbb{N}_0$ (the natural numbers),
or of $\mathbb{Z}$ (the integers). This refers to the natural embedding of $\mathbb{Z}$ in $\Omega$ which we
described above (and further in Chapter 2). These concepts and ideas are presented
systematically in Chapter 3.

In Chapter 4, we study a family of examples of random walk on fractals, focusing 
on simple examples of affine iteration on the line, especially the middle-third Cantor 
set, X3 (Figures 4.1 and 4.2), and its cousin X4 (Figures 4.1 and 4.3) arising from 
quarter divisions. We recall from [JoPe98] that X4 has a natural orthonormal Fourier 
basis, while X3 has no such basis of Fourier frequencies. The difference between 
the two cases may be understood from an analysis of the random walks and the 
associated infinite products and transition probabilities. 
We also recall from [DuJo06b] that both $X_3$ and a wider class of affine fractals
have orthonormal wavelet bases, the so-called gap-filling wavelets.

For these fractal examples and also for the wavelet applications, the basis
properties are determined by the size of the integers $\mathbb{Z}$ naturally embedded in our probability
space $\Omega$: the question is when $P_x(\mathbb{Z}) = 1$, i.e., when $\mathbb{Z}$ has full measure in $\Omega$.

In Chapter 5, we prove a general theorem, Theorem 5.4.1, which captures all the
examples. We prove that if a certain family of tail-sets in $\Omega$ is not negligible, then
the condition $P_x(\mathbb{Z}) = 1$ is satisfied.

In general, the function $x \mapsto P_x(\mathbb{Z})$ is not constant. In fact, we show in Chapter
6 that this function is harmonic. Moreover, it is minimal in a sense we make precise.

In Chapters 7 and 8 we give a number of applications: wavelets, wavelet packets, 
generalized multiresolutions, wavelet filters that take matrix values, and we consider 
applications which depend on a certain family of infinite random products of matrix 
functions. 

In Chapter 9, we study pairs of representations of the Cuntz algebras. While this
is a subject from operator algebras, it turns out to apply to the examples from Sections
8.2, 8.3, and 8.4, i.e., to wavelet packets, and to the corresponding measure-theoretic
issues.
The FBI has a database consisting of some 200 million fingerprint records
... As part of a modernization program, the FBI is digitizing these records 
as 8-bit grayscale images, with a spatial resolution of 500 dots per inch. This 
results in some 10 megabytes per card, making the current archive about 
2,000 terabytes in size. —C.M. Brislawn 1995 


One of the reasons wavelets have found so many uses and applications is that
they are especially attractive from the computational point of view. Traditionally,
scale/translation wavelet bases are used in function spaces on the real line, or on the
Euclidean space $\mathbb{R}^d$. Since we have Lebesgue measure, the Hilbert space $L^2(\mathbb{R}^d)$
offers the natural setting for harmonic analysis with wavelet bases. These bases can
be made orthonormal in $L^2(\mathbb{R}^d)$, and they involve only a fixed notion of scaling, for
example by a given expansive $d$-by-$d$ matrix $A$ over $\mathbb{Z}$, and translation by the integer
lattice $\mathbb{Z}^d$. But this presupposes an analysis that is localized in a chosen resolution
subspace, say $V_0$ in $L^2(\mathbb{R}^d)$. That this is possible is one of the successes of wavelet
computations. Indeed, it is a non-trivial fact that a rich variety of such subspaces $V_0$
exist; and further that they may be generated by one, or a finite set of functions $\varphi$ in
$L^2(\mathbb{R}^d)$ which satisfy a certain scaling equation [Dau92].

The determination of this equation might only involve a finite set of
numbers (four-tap, six-tap, etc.), and it is of central importance for computation. The
solutions to a scaling equation are called scaling functions, and are usually denoted
$\varphi$. Specifically, the scaling equation relates in a well-known way the $A$-scaling of the
function(s) $\varphi$ to their $\mathbb{Z}^d$-translates.

The fact that there are solutions in $L^2(\mathbb{R}^d)$ is not at all obvious; see [Dau92]. In
application to images, the subspace $V_0$ may represent a certain resolution, and hence
there is a choice involved, but we know by standard theory, see, e.g., [Dau92], that
under appropriate conditions such choices are possible. As a result there are extremely
useful and computationally efficient wavelet bases in $L^2(\mathbb{R}^d)$. A resolution subspace
$V_0$ within $L^2(\mathbb{R}^d)$ can be chosen to be arbitrarily fine: Finer resolutions correspond
to larger subspaces.

As noted for example in [BrJo02b], a variant of the scaling equation is also used 
in computer graphics: there data is successively subdivided and the refined level 
of data is related to the previous level by prescribed masking coefficients. The latter 
coefficients in turn induce generating functions which are direct analogues of wavelet 
filters; see the discussion in Section 1.3, Chapters 7 and 9, and the Appendices. 

One reason for the computational efficiency of wavelets lies in the fact that
wavelet coefficients in wavelet expansions for functions in $V_0$ may be computed
using matrix iteration rather than by a direct computation of inner products: the latter
would involve integration over $\mathbb{R}^d$, and hence would be computationally inefficient,
if feasible at all. The deeper reason why we can compute wavelet coefficients using
matrix iteration is an important connection to the subband-filtering method from
signal/image processing involving digital filters, down-sampling and up-sampling. In
this setting, filters may be realized as functions $m_0$ on a $d$-torus, e.g.,
quadrature-mirror filters.

As emphasized for example in [Jor05] and [Bri95], because of down-sampling,
the matrix iteration involved in the computation of wavelet coefficients involves
so-called slanted Toeplitz matrices $F$ from signal processing. The slanted matrices $F$ are
immediately available; they simply record the numbers (masking coefficients) from
the $\varphi$-scaling equation. Further, these matrices have the computationally attractive
property that the iterated powers $F^k$ become successively more sparse as $k$ increases,
i.e., the matrix representation of $F^k$ has mostly zeroes, and the non-zero terms have
an especially attractive geometric configuration. In fact subband signal processing
yields a finite family, $F$, $G$, etc., of such slanted matrices: for example, with $L$ for
“low frequency” and $H$ for “high frequency,”

$$\varphi = \sum_{k} P_k\,\varphi(2\,\cdot - k), \qquad \psi = \sum_{k} Q_k\,\varphi(2\,\cdot - k), \tag{1.7.1}$$

$$(Fx)_n = \frac{1}{\sqrt{2}}\sum_{k} P_{k-2n}\,x_k, \qquad (Gx)_n = \frac{1}{\sqrt{2}}\sum_{k} Q_{k-2n}\,x_k. \tag{1.7.2}$$
Fig. 1.3. The slanted matrices $F$ and $G$ of (1.7.2).


The associated pair of slanted matrices F and G are sketched in Figure 1.3. 

Before getting to the theory behind the slanted matrices, we will need some 
preparation, and the matrices will then be motivated and studied systematically in 
Chapter 7 and again in Chapter 9. 

In brief outline, the numbers $p_j$ are entered into the rows of the matrix $F$ via
a specific slant-pattern. The placement is initiated in row/column position 0, 0 as
follows: The number $p_0$ is used in the row/column position 0, 0, and as we move to
the next row one number down, the same row is simply repeated, but with a shift to
the right by two places.

Similarly, we shift two places to the left for each one row-move up in the matrix.
The resulting slanting is seen in Figure 1.3. Moreover, the matrix $G$ is created from
the numbers $q_j$ in exactly the same way.
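
As a rough illustration of the slant pattern just described, the following Python
sketch builds a finite corner of a slanted matrix: row $n$ is row 0 shifted to the right
by $2n$ places. The function name `slanted_matrix`, the truncation to non-negative
row and column indices, and the sample four-tap mask are assumptions made only for
illustration; the normalizing factor is left out so that the placement alone is visible.

```python
def slanted_matrix(coeffs, n_rows, n_cols):
    """Return a finite corner of the slanted matrix built from coeffs.

    Entry (n, k) holds coeffs[k - 2*n] when that index is in range and 0
    otherwise, so each row repeats the previous one shifted right by two.
    """
    rows = []
    for n in range(n_rows):
        row = []
        for k in range(n_cols):
            j = k - 2 * n
            row.append(coeffs[j] if 0 <= j < len(coeffs) else 0)
        rows.append(row)
    return rows


# Hypothetical four-tap mask, chosen only to make the pattern easy to read.
p = [1, 2, 3, 4]
for row in slanted_matrix(p, 3, 8):
    print(row)
```

Printing the rows shows the mask moving two columns to the right per row, which is
the down-sampling by two in matrix form.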

The wavelet coefficients at scaling level $k$ of a numerical signal $s$ from $V_0$ are then
simply the coordinates of $G F^k s$. By this we mean that a signal in $V_0$ is represented
by a vector $s$ via a fixed choice of scaling function; see [Dau92, BrJo02b, Jor03].
Then the matrix product $G F^k$ is applied to $s$; and the matrices $G F^k$ get more slanted
as $k$ increases.
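
The product $G F^k s$ can be evaluated without forming any matrix, since each factor
is a filter followed by down-sampling by two. The sketch below does this for a finite
signal. It assumes the normalization $(Fx)_n = \frac{1}{\sqrt{2}} \sum_k p_{k-2n} x_k$, treats entries
outside the signal as zero, and uses the Haar masks $p = (1, 1)$, $q = (1, -1)$ as a
sample choice; the function names are hypothetical.

```python
import math


def apply_slanted(coeffs, x):
    """Compute y_n = (1/sqrt 2) * sum_k coeffs[k - 2n] * x[k] for a finite list x."""
    out = []
    for n in range(len(x) // 2):
        total = 0.0
        for j, c in enumerate(coeffs):
            k = j + 2 * n
            if k < len(x):
                total += c * x[k]
        out.append(total / math.sqrt(2))
    return out


def wavelet_coefficients(p, q, s, level):
    """Return the coordinates of G F^level s."""
    x = list(s)
    for _ in range(level):
        x = apply_slanted(p, x)  # one application of F
    return apply_slanted(q, x)   # final application of G


p_haar = [1.0, 1.0]
q_haar = [1.0, -1.0]
s = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
print(wavelet_coefficients(p_haar, q_haar, s, 0))
print(wavelet_coefficients(p_haar, q_haar, s, 1))
```

Each application of $F$ halves the length of the vector, so the cost of computing all
levels is proportional to the length of the input signal.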

Our approach begins with the observation that the computational feature of this
engineering device can be said to begin with an endomorphism $r_A$ of the $d$-torus
$\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$, an endomorphism which results from simply passing matrix
multiplication by $A$ on $\mathbb{R}^d$ to the quotient by $\mathbb{Z}^d$. It is then immediate that the inverse
images $r_A^{-1}(x)$ are finite for all $x$ in $\mathbb{T}^d$, in fact $\# r_A^{-1}(x) = |\det A|$. From this we
recover the scaling identity, and we note that the wavelet scaling equation is a special
case of a more general identity known in computational fractal theory and symbolic
dynamics [Jor04a]. We show that wavelet algorithms and harmonic analysis naturally
generalize to affine iterated function systems. Moreover, in this general context, we are
able to build the ambient Hilbert spaces for a variety of dynamical systems which arise
from the iterated dynamics of endomorphisms of compact spaces [DuJo06b].
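
The count $\# r_A^{-1}(x) = |\det A|$ can be checked by direct enumeration in a small case.
The sketch below is only an illustration under stated assumptions: $d = 2$, $A$ is an
integer matrix with non-zero determinant, and points of the torus are represented by
exact fractions in $[0, 1)^2$. The function name `preimages` is hypothetical.

```python
from fractions import Fraction


def preimages(A, x):
    """Return the set of points y in the 2-torus with A y = x (mod Z^2).

    A is a 2x2 integer matrix with non-zero determinant, given as
    ((a, b), (c, d)); x is a pair of Fractions in [0, 1).
    """
    (a, b), (c, d) = A
    det = a * d - b * c
    n = abs(det)
    points = set()
    # Since A * adj(A) = det * I, the shifts k in {0..n-1}^2 reach every
    # coset of A Z^2 in Z^2, hence every preimage.
    for k1 in range(n):
        for k2 in range(n):
            u, v = x[0] + k1, x[1] + k2
            y1 = ((d * u - b * v) / det) % 1
            y2 = ((-c * u + a * v) / det) % 1
            points.add((y1, y2))
    return points


A = ((2, 1), (0, 3))
x = (Fraction(1, 5), Fraction(2, 7))
ys = preimages(A, x)
print(len(ys), abs(2 * 3 - 1 * 0))
```

The two printed numbers agree: the number of distinct preimages equals the index of
$A\mathbb{Z}^2$ in $\mathbb{Z}^2$.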

As a consequence, the fact that the ambient Hilbert space in the traditional wavelet
setting is the more familiar $L^2(\mathbb{R}^d)$ is merely an artifact of the choice of filters
$m_0$. As we further show, by enlarging the class of admissible filters, there is a variety
of other ambient Hilbert spaces possible with corresponding wavelet expansions: the
most notable are those which arise from iterated function systems (IFSs) of fractal
type, for example for the middle-third Cantor set and scaling by 3.

With examples, theorems, and graphics, we hope to bring these threads to light: 
The journey from wavelets to fractals via signal processing. 

More generally (see Chapters 4 and 5), there is a variety of other natural dynam- 
ical settings (affine IFSs) that invite the same computational approach. 

The two most striking examples which admit such a harmonic analysis are perhaps
complex dynamics and subshifts. Both will be worked out in detail. In the first
case, consider a given rational function $r(z)$ of one complex variable. We then get
an endomorphism $r$ acting on an associated Julia set $X$ in the complex plane $\mathbb{C}$ as
follows: This endomorphism $r\colon X \to X$ results by restriction to $X$ [Bea91]. (Details:
Recall that $X$ is by definition the complement of the points in $\mathbb{C}$ where the
sequence of iterations $r^n$ is a normal family. Specifically, the Fatou set $F$ of $r(z)$ is
the largest open set in $\mathbb{C}$ where $r^n$ is a normal sequence of functions, and we let $X$
be the complement of $F$. Here $r^n$ denotes the $n$’th iteration of the rational function
$r(z)$.) The induced endomorphism $r$ of $X$ is then simply the restriction to $X$ of $r(z)$.
If $r$ then denotes the resulting endomorphism, $r\colon X \to X$, it is known [DuJo06a] that
$\# r^{-1}(x) =$ degree of $r$, for every $x$ in $X$ (except for a finite set of singular points).

In the second case, for a particular one-sided subshift, we may take X as the 
corresponding state space, and again we have a naturally induced finite-to-one endo- 
morphism of X of geometric and computational significance. 

But in the general framework, there is no natural candidate for the ambient
Hilbert space. That is good in one sense, as it means that the subband filters $m_0$
which are feasible will constitute a richer family of functions on $X$.

In all cases, the analysis is governed by a random-walk model with successive
iterations where probabilities are assigned on the finite sets $r^{-1}(x)$ and are given by
the function $W := |m_0|^2$. This leads to a transfer operator $R_W$ (Section 2.3) which
has features in common with the classical operator considered first by Perron and
Frobenius for positive matrices; in particular, it has a Perron–Frobenius eigenvalue,
and positive Perron–Frobenius eigenvectors, one on the right, a function, and one
on the left, a measure; see [Rue89]. As we show in Chapters 2 and 6, this
Perron–Frobenius measure, also sometimes called the Ruelle measure, is an essential
ingredient for our construction of an ambient Hilbert space. All of this applies
to a variety of examples and has the more traditional wavelet setup as a
special case, in fact the special case when the Ruelle measure on $\mathbb{T}^d$ is the Dirac mass
corresponding to the point 0 in $\mathbb{T}^d$ (additive notation) representing zero frequency in
the signal-processing setup.
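
To make the random-walk picture concrete in the simplest case, the sketch below takes
$d = 1$, $r(x) = 2x \bmod 1$, and the Haar low-pass filter normalized so that $m_0(0) = 1$.
It assumes the common form $(R_W f)(x) = \sum_{r(y) = x} W(y) f(y)$ for the transfer operator;
the precise definition is the one given in Section 2.3, and the function names here are
hypothetical.

```python
import cmath
import math


def m0_haar(x):
    """Haar low-pass filter on the circle, normalized so that m0(0) = 1."""
    return (1 + cmath.exp(-2j * math.pi * x)) / 2


def W(x):
    """Weight function W = |m0|^2."""
    return abs(m0_haar(x)) ** 2


def transfer(f, x):
    """(R_W f)(x): sum of W(y) f(y) over the two solutions y of 2y = x mod 1."""
    total = 0.0
    for y in (x / 2, (x + 1) / 2):
        total += W(y) * f(y)
    return total


def one(y):
    return 1.0


for x in (0.0, 0.1, 0.37, 0.9):
    print(x, W(x / 2), W((x + 1) / 2), transfer(one, x))
```

For this filter the two weights at each $x$ add up to one, so they may be read as
transition probabilities, and the constant function 1 is fixed by $R_W$.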

There are two more ingredients entering in our construction of the ambient
Hilbert space: a path-space measure governed by the $W$-probabilities, and certain
finite cycles for the endomorphism $r$. For each $x$ in $X$, we consider paths by infinite
iterated tracing back with $r^{-1}$ and recursively assigning probabilities with $W$. Hence
we get a measure $P_x$ on a space of paths for each $x$ (Section 2.4). These measures
are in turn integrated in $x$ using the Ruelle measure on $X$. The resulting measure will
now define the inner product in the ambient Hilbert space.

Our present harmonic analysis for these systems is governed by a certain class
of geometric cycles for $r$, i.e., cycles under iteration by $r$. We need cycles where
the function $W$ attains its maximum, and we call them $W$-cycles. They are essential,
and we include a discussion of $W$-cycles for particular examples, including their
geometry, and a discussion of their significance for the computation in an orthogonal
harmonic analysis; see especially Chapter 9. We give a necessary and sufficient
condition for a certain class of affine fractals in $\mathbb{R}^d$ to have an orthonormal Fourier
basis, and we even give a recipe for what these orthonormal bases look like. We
believe that this throws new light on a rather fundamental question: which fractals
admit complete sets of Fourier frequencies? Our results further extend earlier work
by a number of authors, and in particular, clarify the scale-4 Cantor set on the line,
considered earlier by the author and S. Pedersen [JoPe98], and also by R. Strichartz
[Str00, Str05], and I. Łaba and Y. Wang [ŁaWa02].
Exercises


Problems worthy 
of attack 
prove their worth 
by hitting back. 
—Piet Hein 


1.1. Let $\mathbb{Z}_2 = \{0, 1\}$, and $\Omega := \prod_1^\infty \mathbb{Z}_2$.
(a) For $n \in \mathbb{N}$, $i_j \in \{0, 1\}$, $1 \le j \le n$, set

\[ A(i_1, \dots, i_n) = \{\, \omega \in \Omega \mid \omega_j = i_j,\ 1 \le j \le n \,\}. \]

Show that $\varnothing$ and these sets generate a topology $\mathcal{T}$ and a $\sigma$-algebra $\mathcal{B}$, and moreover
that

\[ \mathcal{T} \subset \mathcal{B} \subset \mathcal{P}, \]

where $\mathcal{P}$ denotes the set of all subsets of $\Omega$. We shall let $\mathcal{T}$ and $\mathcal{B}$ denote topology
and $\sigma$-algebra, respectively, whenever continuity or measurability is needed, unless
otherwise specified.

(b) Show that $\Omega$ is uncountable.

(c) Show that $(\Omega, \mathcal{T})$ is a compact space.

(d) Show that $\Omega$ and the Cantor set $X_3$ (with its usual topology) are homeomorphic.

(e) Show that the functions $\chi_{A(i_1, \dots, i_n)}$ separate points in $\Omega$.

(f) Using the Stone–Weierstraß theorem, show that the algebra generated by the
constant function 1 on $\Omega$ and the functions $\chi_{A(i_1, \dots, i_n)}$ is dense in $C(\Omega)$.

(g) Let $p \in (0, 1)$ be fixed, and set $L(1) = 1$, and

\[ L\bigl(\chi_{A(i_1, \dots, i_n)}\bigr) = p^{\#\{j \mid i_j = 1\}}\,(1 - p)^{\#\{k \mid i_k = 0\}}. \]

Using (f), show that $L$ extends uniquely by linearity to $C(\Omega)$, and that there is a
measure $\mu = \mu_p$ on $\mathcal{B}$ such that

\[ L(f) = \int_\Omega f \, d\mu \quad \text{for all } f \in C(\Omega). \]

(The measure $\mu_p$ is called the $p$-Bernoulli-product measure.)
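
Before attempting (g), it may help to experiment with the values that $L$ assigns to
cylinder sets. The sketch below is a numerical aid and not a solution: it evaluates
$p^{\#\{j \mid i_j = 1\}} (1 - p)^{\#\{k \mid i_k = 0\}}$ with exact fractions and checks two consistency
properties that any measure must have. The function name `cylinder_measure` and
the sample value $p = 3/10$ are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def cylinder_measure(bits, p):
    """Value assigned to the cylinder set A(i_1, ..., i_n) for bits = (i_1, ..., i_n)."""
    ones = sum(bits)
    zeros = len(bits) - ones
    return p ** ones * (1 - p) ** zeros


p = Fraction(3, 10)

# The 2^n cylinder sets of length n partition the whole space.
total = sum(cylinder_measure(bits, p) for bits in product((0, 1), repeat=4))
print(total)

# A(i_1..i_n) is the disjoint union of A(i_1..i_n, 0) and A(i_1..i_n, 1).
bits = (1, 0, 1)
split = cylinder_measure(bits + (0,), p) + cylinder_measure(bits + (1,), p)
print(split == cylinder_measure(bits, p))
```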


1.2. Let the measures $\mu_p$ be as in Exercise 1.1(g). Show that for $p \ne p'$, the two
measures $\mu_p$ and $\mu_{p'}$ are mutually singular.


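1.3. (a) Show that $\mathbb{Z}_2$ is an abelian group under addition mod 2 in $\mathbb{Z}$.
(b) Show that $\Omega$ in Exercise 1.1 becomes a compact abelian group under the
infinite-“product” operation, i.e., with

\[ (\omega_i) + (\omega'_i) = (\omega_i + \omega'_i), \]

where $\omega_i + \omega'_i$ is the addition in $\mathbb{Z}_2$.
(c) The Haar measure $\mu$ of $\Omega$ is defined on $\mathcal{B}$ from Exercise 1.1(a) and satisfies:

• $\mu$ is a measure ($\sigma$-additive) defined on the $\sigma$-algebra $\mathcal{B}$;
• $\mu(\Omega) = 1$;
• $\mu(E + \omega) = \mu(E)$ for all $E \in \mathcal{B}$ and $\omega \in \Omega$, where $E + \omega := \{\xi + \omega \mid \xi \in E\}$.

Find the Haar measure $\mu$ of $\Omega$. Explicit formula!

Hint: Take $p = 1/2$ in Exercise 1.1(g), and show that $\mu_{1/2}$ satisfies the three
listed conditions. You may use the fact that Haar measure is unique subject to these
conditions.

(d) Set

\[ \Lambda := \{\, \lambda = (\lambda_i)_{i \in \mathbb{N}} \mid \lambda_i \in \mathbb{Z}_2,\ \lambda_i = 0 \text{ except for a finite number of places} \,\}, \]

and set

\[ \chi_\lambda(\omega) = \prod_{k=1}^{\infty} e^{i \pi \lambda_k \omega_k}, \]

and then show that $\chi_\lambda$ is a well-defined function on $\Omega$ for all $\lambda \in \Lambda$.

(e) Show that each $\chi_\lambda$ is continuous and satisfies:

• $\chi_0(\omega) = 1$ for all $\omega \in \Omega$;
• $\chi_\lambda(\omega + \omega') = \chi_\lambda(\omega)\,\chi_\lambda(\omega')$ for all $\lambda \in \Lambda$ and $\omega, \omega' \in \Omega$.

(Note: $\chi_\lambda$ is called a character on the group $\Omega$.)
(f) Show that the linear span of the characters $\{\chi_\lambda \mid \lambda \in \Lambda\}$ is dense in $C(\Omega)$.
(g) Show that

\[ \int_\Omega \overline{\chi_\lambda(\omega)}\,\chi_{\lambda'}(\omega)\, d\mu(\omega) = \delta_{\lambda, \lambda'} \]

for $\lambda, \lambda' \in \Lambda$, where $\mu$ is Haar measure.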


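1.4. Definition: An indexed family $\{h_\lambda \mid \lambda \in \Lambda\}$ in a Hilbert space $\mathcal{H}$ is said to be
a frame (or a frame basis) if there are positive constants $A$ and $B$ such that

\[ A \|f\|^2 \le \sum_{\lambda \in \Lambda} |\langle h_\lambda \mid f \rangle|^2 \le B \|f\|^2 \quad \text{for all } f \in \mathcal{H}. \tag{1E.1} \]

Returning to the Cantor group $\Omega$ and its dual $\Lambda$, we examine the family

\[ \{\chi_\lambda \mid \lambda \in \Lambda\} \]

in the Hilbert space $L^2(\Omega, \mu_p)$ for each $p$, $0 < p < 1$.
(a) Show that $\{\chi_\lambda \mid \lambda \in \Lambda\}$ is an orthonormal basis in $L^2(\Omega, \mu_p)$ if and only if
$p = 1/2$.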
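(b) Show that $\{\chi_\lambda \mid \lambda \in \Lambda\}$ is not a frame basis in $L^2(\Omega, \mu_p)$ when $p \ne 1/2$.

Hint: For each $n$, consider the subset $\Lambda_n \subset \Lambda$ of points $\lambda = (\lambda_i)$ such that
$\lambda_i = 0$ for $i > n$; and for each $k$, $0 \le k \le n$, the subset $\Lambda_n(k) \subset \Lambda_n$ consisting of
$\lambda = (\lambda_i)$ in $\Lambda_n$ such that $\#\{i \mid \lambda_i = 1\} = k$. Let $f$ be the constant function 1, or
equivalently, $f = \chi_0$. Using $\#\Lambda_n(k) = \binom{n}{k}$, justify the following calculation:

\[ \sum_{\lambda \in \Lambda_n} |\langle \chi_\lambda \mid f \rangle|^2
   = \sum_{k=0}^{n} \sum_{\lambda \in \Lambda_n(k)} |\langle \chi_\lambda \mid f \rangle|^2
   = \sum_{k=0}^{n} \binom{n}{k} |1 - 2p|^{2k}
   = \bigl(1 + |1 - 2p|^2\bigr)^n. \]

Since this limit is $\infty$ for $n \to \infty$ whenever $p \ne 1/2$, the conclusion follows.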
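
The identity in the hint can be checked by brute force for small $n$ before it is
justified. The sketch below is only a numerical aid: it assumes that a one in position
$j$ carries weight $p$ (as in Exercise 1.1(g)) and that $\chi_\lambda(\omega) = \prod_k (-1)^{\lambda_k \omega_k}$; the function
names and the sample value $p = 1/3$ are hypothetical.

```python
from fractions import Fraction
from itertools import product


def inner_with_one(lam, p):
    """<chi_lam | 1> in L^2(mu_p), summed over all cylinder sets of length len(lam)."""
    total = Fraction(0)
    for omega in product((0, 1), repeat=len(lam)):
        weight = Fraction(1)
        sign = 1
        for l, w in zip(lam, omega):
            weight *= p if w == 1 else 1 - p
            if l == 1 and w == 1:
                sign = -sign
        total += sign * weight
    return total


def frame_sum(n, p):
    """Sum over lam in Lambda_n of |<chi_lam | 1>|^2 (the characters are real-valued)."""
    return sum(inner_with_one(lam, p) ** 2 for lam in product((0, 1), repeat=n))


p = Fraction(1, 3)
for n in range(1, 6):
    print(n, frame_sum(n, p), (1 + (1 - 2 * p) ** 2) ** n)
```

For each $n$ the two printed fractions coincide, and they grow geometrically in $n$
unless $p = 1/2$, which is the obstruction to an upper frame bound.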

(c) Is there a positive lower frame bound for $\{\chi_\lambda \mid \lambda \in \Lambda\}$ in the Hilbert space
$L^2(\Omega, \mu_p)$, $p \ne 1/2$? In other words, is there some $A > 0$ such that the lower
estimate in the relation (1E.1) is valid for all $f \in L^2(\Omega, \mu_p)$?


1.5. Using the idea from Exercise 1.1(a), show that there is a natural orthonormal
basis (ONB) for $L^2(\Omega, \mu_{1/2})$.
1.6. Can you find a countable ONB for $L^2(\Omega, \mu_{1/2})$?
1.7. Let $\varphi_a$ and $\varphi_b$ denote the functions in the top part of Figure 1.2 (p. 13). Which
of the two families of functions is orthogonal in $L^2(\mathbb{R})$,

\[ \{\varphi_a(\,\cdot - k) \mid k \in \mathbb{Z}\} \quad \text{or} \quad \{\varphi_b(\,\cdot - k) \mid k \in \mathbb{Z}\}? \]


1.8. Let $\psi_a$ and $\psi_b$ denote the two functions in the bottom part of Figure 1.2 (p. 13).
(a) Which of the two families of functions

\[ \{2^{j/2} \psi_a(2^j t - k) \mid j, k \in \mathbb{Z}\} \quad \text{or} \quad \{2^{j/2} \psi_b(2^j t - k) \mid j, k \in \mathbb{Z}\} \]

forms an ONB in $L^2(\mathbb{R})$?
(b) Give a direct argument with pictures, showing that the double-indexed family
of functions

\[ \psi^a_{j,k}(t) = 2^{j/2} \psi_a(2^j t - k) \]

forms an ONB in $L^2(\mathbb{R})$.

(c) Give a direct argument with pictures, showing that the double-indexed family
of functions

\[ \psi^b_{j,k}(t) = 2^{j/2} \psi_b(2^j t - k) \]

satisfies

\[ \|f\|^2 = \sum_{j, k \in \mathbb{Z}} |\langle \psi^b_{j,k} \mid f \rangle|^2 \]

for all $f \in L^2(\mathbb{R})$.
1.9. Consider the four versions of the scaling identity on $\mathbb{R}$ listed in Section 1.3,
Example 1.3.3, under (a), (b), (c), and (d). 

Fill in the missing arguments and check that the conclusions listed in the sum- 
mary at the end of Section 1.3 are correct. 


1.10. Let $\varphi \in L^2(\mathbb{R})$, and consider the ansatz, for some operator $W = W_\varphi$,

\[ W_\varphi \Bigl( \sum_{k \in \mathbb{Z}} s_k\, \varphi(\,\cdot - k) \Bigr) = (s_k), \]

where $(s_k)$ is some sequence.

(a) Give a necessary and sufficient condition on the function $\varphi$ for when $W_\varphi$
defines a bounded linear operator from the natural closed subspace $V_\varphi \subset L^2(\mathbb{R})$
into $\ell^2 = \ell^2(\mathbb{Z})$.

(b) Give a necessary and sufficient condition on the function $\varphi$ for when $W_\varphi$
defines an invertible operator from $V_\varphi$ onto $\ell^2$.

(c) Give a necessary and sufficient condition on the function $\varphi$ for when $W_\varphi$
defines an isometric isomorphism of $V_\varphi$ onto $\ell^2$.


The remaining exercises in this chapter are most likely review for mathemat- 
ics students: We hope that they make it easier for the reader to build up (or more 
likely review!) prerequisites that first-year grad students might need to refresh, as 
they move from one chapter to the next: Fourier, Hilbert, bases, a little on linear 
operators, etc. All standard material first year grad students have seen but perhaps 
might want to brush up on: Learning by doing! 


1.11. Hilbert space (the separable case) 


Definition (the list of axioms) 


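• $\mathcal{H}$: vector space over the complex field $\mathbb{C}$.

• $\langle\,\cdot \mid \cdot\,\rangle\colon \mathcal{H} \times \mathcal{H} \to \mathbb{C}$, satisfying
(i) $x \mapsto \langle x \mid y \rangle$ is conjugate linear for all $y \in \mathcal{H}$,
(ii) $y \mapsto \langle x \mid y \rangle$ is linear for all $x \in \mathcal{H}$,
(iii) $\langle x \mid y \rangle = \overline{\langle y \mid x \rangle}$, $x, y \in \mathcal{H}$,
(iv) $\langle x \mid x \rangle \ge 0$ for all $x \in \mathcal{H}$, and
(v) $\langle x \mid x \rangle = 0 \Rightarrow x = 0$.

• Setting $\|x\| := \langle x \mid x \rangle^{1/2}$ turns $\mathcal{H}$ into a complete normed space, i.e., if some
sequence $(x_n)_{n \in \mathbb{N}}$ satisfies $\|x_n - x_m\| \to 0$ as $n, m \to \infty$, then there is some $x \in \mathcal{H}$
such that $\|x - x_n\| \to 0$ as $n \to \infty$.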


Review your real-variables book, and then show that the following are examples 
of Hilbert spaces: 
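• $\mathcal{H} := \mathbb{C}^k$ = the $k$-dimensional complex vector space with $x = (x_1, \dots, x_k)$,
$x_j \in \mathbb{C}$ for $1 \le j \le k$, and

\[ \langle x \mid y \rangle := \sum_{j=1}^{k} \overline{x_j}\, y_j. \]

• $\mathcal{H} := \ell^2$ = all sequences $x = (x_j)_{j \in \mathbb{N}}$ such that $\sum_{j=1}^{\infty} |x_j|^2 < \infty$, where we set

\[ \langle x \mid y \rangle := \sum_{j=1}^{\infty} \overline{x_j}\, y_j. \]

• $\mathcal{H} := \ell^2(A)$ where $A$ is some fixed countable set = all functions $x\colon A \to \mathbb{C}$
satisfying $\sum_{\alpha \in A} |x(\alpha)|^2 < \infty$, and where we set

\[ \langle x \mid y \rangle := \sum_{\alpha \in A} \overline{x(\alpha)}\, y(\alpha). \]

Does this construction work also if $A$ is not countable? What is $\ell^2(A)$ if $A$ is not
countable?

• Let $(X, \mathcal{B}, \mu)$ be a $\sigma$-finite measure space, and set $\mathcal{H} := L^2(X, \mathcal{B}, \mu)$, or $L^2(\mu)$
for short, i.e., all measurable functions $f$ on $X$ satisfying

\[ \int_X |f|^2 \, d\mu < \infty, \]

where we set

\[ \langle f \mid g \rangle := \int_X \overline{f}\, g \, d\mu, \]

and where we identify two functions $f_1$ and $f_2$ if there is a subset $S$ with $\mu(S) = 0$,
and $f_1 = f_2$ on $X \setminus S$.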
1.12. Give your own proof of the following basic facts about Hilbert space H. 


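• $|\langle x \mid y \rangle| \le \|x\|\,\|y\|$, $x, y \in \mathcal{H}$ (Schwarz’s inequality).

• A linear mapping $L\colon \mathcal{H} \to \mathbb{C}$ (functional) satisfies $|L(x)| \le \text{Const.}\,\|x\|$ if and
only if it has the form $L(x) = \langle y \mid x \rangle$ for some $y \in \mathcal{H}$ (Riesz’s theorem).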

1.13. Operators 


Let $\mathcal{H}_i$, $i = 1, 2$, be two Hilbert spaces and let $T\colon \mathcal{H}_1 \to \mathcal{H}_2$ be a linear
transformation. Show that the following conditions are equivalent.


• $T$ is continuous relative to the respective norms on $\mathcal{H}_1$ and $\mathcal{H}_2$.
• $\|Tx\|_2 \le C \|x\|_1$, $\forall x \in \mathcal{H}_1$, holds for some finite constant $C$.


• $|\langle T x_1 \mid x_2 \rangle| \le C \|x_1\|_1 \|x_2\|_2$, $\forall x_i \in \mathcal{H}_i$, $i = 1, 2$, holds for some finite
constant $C$.


Show that the best constant C is the same in the two estimates. 


1.14. Fourier series 


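(a) We shall identify the interval $I = [0, 1)$ and $\mathbb{T} := \{z \in \mathbb{C} \mid |z| = 1\}$ via
$z = e^{i 2 \pi t}$. Show that this identification respects the usual structures of topology and
measurable subsets for $I$ and for $\mathbb{T}$.


(b) Show that the restriction of the usual Lebesgue measure on $\mathbb{R}$ to $I$ induces a
unique normalized measure $\mu$ on $\mathbb{T}$, and that this measure satisfies

\[ \mu(zE) = \mu(E) \]

whenever $z \in \mathbb{T}$ and $E$ is a measurable subset of $\mathbb{T}$. (Here $zE := \{zw \mid w \in E\}$.)


(c) For $n \in \mathbb{Z}$, set $e_n(z) := z^n$. Show that the functions $\{e_n \mid n \in \mathbb{Z}\}$ on $\mathbb{T}$ are in
$L^2(\mathbb{T})$, and satisfy the two conditions


• $\langle e_n \mid e_m \rangle = \delta_{n,m}$, $n, m \in \mathbb{Z}$; and


• $f \in L^2(\mathbb{T})$ and $\langle e_n \mid f \rangle = 0$ for all $n \in \mathbb{Z}$ $\Rightarrow$ $f = 0$.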


(A system of functions with these properties is called an orthonormal basis (ONB).) 
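(d) If $f \in L^2(\mathbb{T})$, set

\[ (Uf)(n) = \hat{f}(n) = \langle e_n \mid f \rangle = \int_{\mathbb{T}} \overline{e_n}\, f \, d\mu, \]

and show that $U\colon L^2(\mathbb{T}) \to \ell^2(\mathbb{Z})$ is linear, isometric, and maps onto $\ell^2(\mathbb{Z})$.
(e) Deduce from (d) that the identity

\[ \sum_{n \in \mathbb{Z}} |\hat{f}(n)|^2 = \int_{\mathbb{T}} |f|^2 \, d\mu \quad \text{(Parseval)} \]

holds for all $f \in L^2(\mathbb{T})$, and then make precise the following representation:

\[ f(z) = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, z^n, \quad z \in \mathbb{T}. \tag{1E.2} \]

(f) Review your real-variables book and check that the last identity (1E.2) holds
a.e. on $\mathbb{T}$, i.e., the series (1E.2) is convergent on $\mathbb{T}$ except possibly on a subset of
measure zero. (This is a difficult theorem due to Lennart Carleson.)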
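
For a trigonometric polynomial, the Parseval identity in (e) can be checked numerically
before it is proved. In the sketch below the integrals over $\mathbb{T}$ are replaced by equally
spaced Riemann sums, which are exact (up to rounding) for the low frequencies used
here; the sample coefficients, the number of sample points, and the function names are
assumptions made for illustration.

```python
import cmath
import math


def on_circle(coeffs, t):
    """Evaluate f(z) = sum_n c_n z^n at z = exp(i 2 pi t); coeffs maps n to c_n."""
    return sum(c * cmath.exp(2j * math.pi * n * t) for n, c in coeffs.items())


def fourier_coefficient(coeffs, n, samples=64):
    """Riemann-sum approximation of <e_n | f>, the integral of conj(e_n) f over T."""
    total = 0j
    for m in range(samples):
        t = m / samples
        total += cmath.exp(-2j * math.pi * n * t) * on_circle(coeffs, t)
    return total / samples


def norm_squared(coeffs, samples=64):
    """Riemann-sum approximation of the integral of |f|^2 over T."""
    return sum(abs(on_circle(coeffs, m / samples)) ** 2 for m in range(samples)) / samples


c = {-2: 0.5, 0: 1 - 1j, 3: 2j}
lhs = sum(abs(fourier_coefficient(c, n)) ** 2 for n in range(-5, 6))
rhs = norm_squared(c)
print(lhs, rhs)
```

The two printed numbers agree to rounding error, and each recovered coefficient
$\hat{f}(n)$ matches the corresponding entry of `c`.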
History


Naturally infinite products have a rich history in mathematics and its applications; 
but in this book, the use of infinite products and applications to wavelet algorithms 
and harmonic analysis are emphasized. This serves as a modern key between analysis 
and probability: Perhaps somewhat surprisingly, the wavelet connection started with 
signal processing. In fact, the use of digital filters in wavelet theory dates from the 
1980s, with the pioneering work of S. Mallat [Mal89], A. Cohen [Coh90], W. Lawton 
[Law91a, Law91b], among others. And fundamental ideas of Y. Meyer [Mey79] and 
I. Daubechies [Dau92] loom in the background. For this interdisciplinary connection, 
we further refer to the papers cited in [Dau92] and in [BrJo02b]. Another element 
is the transfer operator, or the wavelet transfer operator. It has many names, e.g., the 
Perron—Frobenius—Ruelle operator. Or by now this transfer operator is often called 
the Ruelle operator because of D. Ruelle’s use of it in the 1960s on phase-transition 
problems in statistical mechanics [Rue69]. As we hope will become clear in the 
first two chapters below, the transfer operator (in any one of its many incarnations) 
serves as a crucial mathematical link connecting diverse interdisciplinary trends. The 
probability aspect of this endeavor was stressed by R. Gundy [Gun00] and others, 
e.g., [CoRa90]. Our presentation here relies in crucial ways on Gundy’s viewpoint, 
as it seems to unify the different threads in the subject that came before it. 

It turns out that the marriage of signal processing and wavelets inspired the gen- 
eration of new algorithms that now go under the name of wavelet algorithms. The 
algorithms have further served to make wavelets useful to engineers and the medical 
community, among others. 

Mathematically, the choice of closed subspaces $(V_n)$ in Hilbert space to model
resolutions is inspired by optics and image processing. So this inspiration came from
physics, and from very practical applications. (And hence the terms resolution, pixels,
and level of detail have now entered mathematics.) In the Hilbert-space context,
the relative complement $W_n$ of two successive spaces $V_n$ from the nested scale of
spaces represents a level of detail, and the reader may find it helpful to think of the
elements in some space $W_n$ as representing the detail level in an image. But at the
same time, this is also the way to think of algorithms, much the same way we think of
numerical algorithms based on the positional number system. The positional number
representation closely mirrors the wavelet context. In the wavelet algorithm, there
is a scaling operation which implements a kind of similarity (to be made precise in
Chapter 5) between level $n$ and the next level $n + 1$ in the finer scale of resolutions.
Hence we have $V_n$ realized as a subspace of $V_{n+1}$. The intersection of the spaces $V_n$
is $\{0\}$ and the union is dense in $L^2(\mathbb{R})$.

Mathematically, this viewpoint has inspired the use of pyramid algorithms in 
wavelets (see Chapters 7, 8, and 9). The viewpoint is versatile, and applies equally 
well to one and several dimensions, as we shall see. 
Infinite products have played a central role in a number of classical problems for
more than a hundred years. Recall some very familiar cases: 


(1) The factorization over the primes of the Riemann zeta function [Edw01], and its 
many generalizations in dynamics [May91, Rue02, Rue94, SeCh67]. 

(2) The solution of second order differential equations with the use of a 2 x 2
propagator matrix [PaPo90].

(3) Complex dynamics, iterated substitution of rational functions [Sch1871, FrLM83]. 

(4) Symbolic dynamics: Iterated substitution from words to letters, random products 
of Perron—Frobenius matrices [Jor01a, Per07, Wal01, Bal00, Rad99]. 

(5) Statistical mechanics and phase-transition problems [Rue69, Bal00]. 

(6) The solution of diffusion equations using Wiener’s path measure [Sim79]. 

(7) Analytic continuation of Wiener’s solution in (6) to the solution of Schrödinger’s
equation; again based on path measures, but now the (ill-defined) Feynman
measure [Nel64, Nel69].


More recent applications of infinite random products include: 


(8) piecewise linear iterated function systems (IFS) [BrJo99a, DeSh04, Shu04, 
Shu05], fractals and chaos [Rue94]; and 

(9) wavelets [Dau92, BrJo02b, CoHR97, CoRa90, Gun00, DoGH00, GuKa00, Gun04, 
PaSW99]. 


Even though general methods from probability, random walk, transition proba- 
bilities, and path space have a long history in analysis and in applications, their use 
in wavelet analysis is of rather more recent vintage. 

Ubiquitous to our present approach is a certain “transfer operator” R, see [Bal00] 
and [BrJo02b]. This operator in fact has many incarnations (and many names). It has 
emerged and re-emerged, over the years, in a variety of applications. The underlying 
idea behind it is clear from matrix theory, in the guise of a positive matrix P and the 
familiar Perron—Frobenius theorem on the spectrum of P. But we now address an 
infinite-dimensional setting. 

Our approach is motivated by David Ruelle’s use of an infinite-dimensional vari- 
ant of R in the 1960s. Ruelle used R in his study of phase-transition problems of 
quantum statistical mechanics. Since then, other variants of R have re-emerged in a 
variety of different applications: in dynamics, continuous and discrete, experimental 
and symbolic; and in our (limited) understanding of fractals and other attractors! The 
operator R is used in such applications as wavelets and fractals, as well as in pure 
mathematics (zeta functions, trace formulas, etc.). 

This book is actually focused around (9) from the list of topics above, i.e., the 
kind of random-walk problems (see, e.g., [Spi76, DiFr99]) that are associated with 
analysis of wavelets [Dau92, BrJo02b] and with iterative algorithms for wavelet 
packets [Wic93, Wic94]. 
References and remarks


Historically the meeting ground of probability and mathematical analysis has been in 
the areas of potential theory and stochastic differential equations; see, e.g., [Bas95], 
[Bas03], [Doo94], and [Doo01]. A classic textbook in probability theory for the be- 
ginner is Feller’s [Fel71], and it is still a pleasure to read. 

Recent papers which build on the interconnections between wavelets and fractals 
on the one side and probability and signals on the other include [HeJL04], [Jor01a],
[DuJo06b], [BaJMP04], and [Gun66, Gun99, Gun00, DoGH00, GuKa00, Gun04]. 

In contrast to traditional textbooks and research articles, there is now a new in- 
formation system, Wikipedia, which offers a delightful and student-friendly invi- 
tation to some fundamental wavelet ideas and wavelet packets. One of the entries 
from Wikipedia for the keyword “wavelet” [WWW/wiki], also available as the paper
[ViMu95], contains both delightful exposition and Mathematica do-it-yourself pro- 
cedures. In addition it reaches out to applications (e.g., to hands-on experiments with 
thresholding), and to neighboring fields such as probability. The reader is likely to 
find other helpful and relevant material in [WWW/wiki]. Since it is a “community 
site,” not everything is up to date; but if used with care, it can be really helpful—and 
free! 
Turning to some fundamentals of probability theory, the novice might find the 
lovely survey article [Stro96] helpful. It is a bird’s-eye view on path-space measures 
from the Gaussian viewpoint and their applications. See also [Stro00]. Both treat- 
ments include many useful and modern references. Further, the same author D.W. 
Stroock has an attractive new book [Stro05] giving a concise introduction to Markov 
processes. 

Our present focus is different: We stress certain algorithmic and potential-
theoretic features of the theory of wavelets, fractals, and iterated function systems 
(IFS). Our emphasis is more on the discrete side and on developments that have 
taken place in the past two decades, but they have been motivated by the more clas- 
sical approaches to continuous martingales and stochastic analysis that grew out of 
probabilistic methods for the heat equation and its cousins in applied mathematics. 
One approach to discrete harmonic analysis is via quadratic forms and resistance in- 
equalities; see, e.g., [Kig01]. Our viewpoint is related to this, but different in that it is 
based on the analysis of a certain transfer operator rather than on energy forms as in 
[Kig01], i.e., on certain quadratic forms with resistance numbers relating transitions
between states on a graph configuration. 

As outlined in Section 1.3 in this chapter, Mallat’s approach to multiresolutions
[Mal89] involves three mathematical tools: Fourier transform, infinite products, and
the analysis of periodic functions. In several variables, i.e., functions on $\mathbb{R}^d$, this last
step in turn relies on classical Fourier duality between the familiar rank-$d$ lattice $\mathbb{Z}^d$
on the (multi)frequency side and the compact torus $\mathbb{R}^d/\mathbb{Z}^d$ on the other. Now the
quotient $\mathbb{R}^d/\mathbb{Z}^d$ may be represented as the $d$-torus $\mathbb{T}^d$, but also in a variety of other
ways. Specifically, we may represent it as a suitably selected subset of $\mathbb{R}^d$,
for example as the usual $d$-cube $D$. The other choices for $D$ are called fundamental
domains for the lattice $\mathbb{Z}^d$. Whichever the choice of fundamental domain, we arrive
at the Hilbert space $L^2(D)$ and the following fact: The functions $e_\lambda(x) := e^{i 2 \pi \lambda \cdot x}$
with the vector index $\lambda$ in $\mathbb{Z}^d$ form an orthogonal basis for the Hilbert space $L^2(D)$;
i.e., an ONB when normalized.

There is a variation of this theme which starts with the observation that the pair
of sets $(D, \mathbb{Z}^d)$ is a spectral pair in the sense of [JoPe93]. Specifically, consider
two subsets $D$ and $\Lambda$ of $\mathbb{R}^d$ with $D$ having positive and finite Lebesgue measure.
We say that $(D, \Lambda)$ is a spectral pair if the functions $\{e_\lambda(x) := e^{i 2 \pi \lambda \cdot x} \mid \lambda \in \Lambda\}$ form
an orthogonal basis for $L^2(D)$. The $e_\lambda$ functions are restricted to $D$, and $L^2(D)$
is defined relative to Lebesgue measure, but there are important variations of the
definition (see, e.g., [JoPe96]) when other measures are used, and we shall return to
this theme in Chapter 4.

The joint papers by S. Pedersen and the present author, e.g., [JoPe92], were motivated
in part by an influential paper by B. Fuglede [Fug74]. An important observation
in [Fug74] is that the spectral pairs $(D, \Lambda)$ for which $\Lambda$ is a group are precisely those
for which $D$ is a fundamental domain for a rank-$d$ lattice $\Gamma$. In that case, $\Lambda$
may be taken to be the lattice which is dual to $\Gamma$.

There are a number of other papers which follow up on this general theme. Of 
these perhaps [ACM04] and [CaHM04] and the references cited therein are espe- 
cially relevant. 

In a series of papers (starting with [GaNa98b]) J.-P. Gabardo and his coauthors 
have taken up the idea of using spectral pairs as the basis for wavelet constructions 
which are then based on “non-uniform multiresolutions.” The main theme in this 
approach is the use of general spectral pairs (possibly with irregularities, and with 
inhomogeneities) in the same way Mallat’s traditional multiresolution approach uses 
Fourier’s spectral pair ($d$-cube, $\mathbb{Z}^d$). This theory follows the general theme in this
book, but the details are outside the scope of a moderate-size book; so interested 
readers are referred to [GaNa98b, GaYu05]. Here we only emphasize that the ap- 
proach is especially well suited to the analysis of fractals with multiresolutions. They 
are taken up in Chapters 8 and 9. 

Before embarking on the details in the chapters to follow, the reader may want to 
first consult some relevant references (books and papers) covering measures [Bil99, 
Jor04b, FrLM83], probability theory [Kol77] [Wil91], sampling [Sha49, AlGr01], 
and wavelets [AyTa03, Coh90, BrJo02a, BrJo02b, Dau92, Jor05], and [CoRa90].

As for the general theory of random processes, we recommend the reader study 
the books by Skorohod [Sko61, Sko65]. A more modern approach, stressing martin- 
gales, may be found in the two books by J. Neveu [Nev75, Nev65]. A (small) sample 
of the rich variety of applications includes the papers [Hug95, G1Zu80, Mon64]. 

The function $h$ in (1.3.10) and variations of it depend on the chosen space $X$
and the given weight function $W$. We outlined this function above, and it will again
play a central role in Chapter 3. Here we stress its connection to Kolmogorov’s zero-
one law [Sat99, Theorem 1.14]. Recall that Kolmogorov’s 0-1 law applies to a given
probability space $(\Omega, P, \mathcal{F})$ where $P$ is a probability measure defined on a $\sigma$-algebra
$\mathcal{F}$ in $\Omega$: Let $(\mathcal{F}_n)$ be an independent and countable family of sub-$\sigma$-algebras in $\Omega$,
and let $A$ be a subset of $\Omega$. If, for each $n$, $A$ belongs to the tail-$\sigma$-algebra generated
by $\mathcal{F}_n, \mathcal{F}_{n+1}, \mathcal{F}_{n+2}, \dots$, then it follows that $P(A)$ must be zero, or one.

There are three features of this: 


(1) We view the integers $\mathbb{Z}$ as canonically embedded in the probability space $\Omega$ of
(1.3.6).

(2) When our weight function $W$ is given, we then get an associated family of
measures $P_x$ on $\Omega$. (In our applications to wavelets, $W$ will be $|m_0|^2$ where $m_0$ is
some low-pass wavelet filter.)

(3) For every $x$ in $X$, we may apply the zero-one law of Kolmogorov to the
probability space $(\Omega, P_x)$. Let $x$ be given. If there is an independent and countable
family of $\sigma$-algebras $(\mathcal{F}_n)$ on $\Omega$ such that $\mathbb{Z}$ belongs to all the tail-$\sigma$-algebras of
$(\mathcal{F}_n)$, then it follows that $h(x) = P_x(\mathbb{Z})$ must be zero or one.


We now turn to a closer study of the measures $P_x$ in Chapter 2, and we outline
their dependence on the prescribed weight function $W$.

In addition to the papers and books on wavelets, sampling, and signal processing, 
cited inside Chapter 1, readers might wish to consult one or more of the following 
treatments. They serve to supplement, in one way or the other, the point of view 
taken in this book: [Gro01, HeWe96, JaMe96, JaMR01, Mal98, MeCo97, StNg96,
Waln02]. 

In the chapters to follow, we will stress interconnections between several trends 
in the subject, i.e., we emphasize how tools from analysis (including operator theory 
and operator algebras) and probability theory are used in wavelets, fractals, and dy- 
namics. The part of operator algebras that we have in mind is often called non-com- 
mutative probability, and it is concerned with representations of algebraic structures 
by operators in Hilbert space: indeed, this is the key to their usefulness in wavelets, 
fractals, dynamics, and other areas as well. 

In the non-commutative setting, the analogue of a standard probability measure 
is called a state. In mathematical terms, a state is a positive linear functional on an 
algebra of operators. A normalization condition is usually added to the definition. 
This viewpoint (see especially Chapter 9) is of course motivated by Riesz’s theorem, 
see [Rud87], and quantum physics. Getting a concrete problem represented in Hilbert 
space is the key to the use of such tools as spectral theory; see, for example, the 
following papers where the present author has had a hand: [BaJMP06, DuJo05b, 
DuJo05a, BrJO04, Jor04a, BaJMP05, JoKr03, Jor01c, Jor01b, BrEJ00, BrJKW00,
BrJo99b, Jor99, BrJo97]. But, as will become clear later, each of our themes will 
have its connection to Hilbert space in one form or another, and more citations will 
follow. 


2 


Transition probabilities: Random walk 


The Cat only grinned when it saw Alice. ... 

“Cheshire Puss,” she began, rather timidly, as she did not at all know 
whether it would like the name: however, it only grinned a little wider. 
“Come, it’s pleased so far,” thought Alice, and she went on. “Would you 
tell me, please, which way I ought to go from here?” 

“That depends a good deal on where you want to get to,” said the Cat. 

“I don’t much care where—” said Alice. 

“Then it doesn’t matter which way you go,” said the Cat. 

“so long as I get somewhere,” Alice added as an explanation. 

“Oh, you’re sure to do that,” said the Cat, “if you only walk long 
enough.” —Lewis Carroll 


PREREQUISITES: Curiosity about an idea of Kolmogorov; a vague recollection of 
the Stone-Weierstrass theorem; a rough idea about probabilities, and measurable sets;
comparing measures; having encountered the spectral theorem in its simplest form. 


Prelude 


A key link between wavelets and fractals on the analysis side and random walk on 
the probability side is to be found in the use of filters from signal processing. For 
the standard dyadic wavelets on the real line, we already sketched this approach 
in Chapter 1. Stepping back and taking a more general and systematic view of the 
underlying idea, one sees that in a real sense it is (almost) ubiquitous in both pure 
and applied mathematics. 

Originally “filters” were introduced in their most primitive form by Norbert 
Wiener and Andrei Kolmogorov in the 1930s and 1940s for use in a variety of applied 
problems involving information and time series. Since then and up to the present, this 
has been further refined into an art form by engineers with a view to transmission of
speech signals. The first step in the refinement is an identification of so-called fre- 
quency bands, for example a subdivision into two bands, say high or low, or more 
generally, a subdivision of the whole frequency range into a fixed finite number of 
bands, say N bands. 

In image processing (as used in digital cameras), the same idea, suitably modi- 
fied, yields instead arrays of visual resolutions. Then it is natural to use four filters, 
i.e., $N = 4$, corresponding to recursive and iterated subdivision of squares (called
pixels). 

One approach to the analysis of the signal/image at hand may be understood via a 
random-walk model, i.e., random walk on a combinatorial tree with a suitable N-fold 
branching. Thought about this way, one sees that the basic idea is indeed ubiquitous, 
and in particular that it is further relevant for the kind of geometric self-similarity 
notions which go into our understanding of fractals. 

We will adopt this view here, and note that these more general filters are then 
prescribed by certain functions which “assign probabilities” to branch points, or the 
bands, i.e., assign probabilities to the N “choices” at each step in the “walk.” In the 
case of speech, the good filters from engineering are those that produce perfect recon- 
struction of output signals from a synthesis of subbands. Surprisingly, these carefully 
designed subband filters are the very same ones which may be adapted successfully 
to mathematics problems, and which, among other things, produce efficient wave- 
lets. If there are only two bands, the filters are known as quadrature-mirror filters. It 
is intriguing to take an even wider view and to compare the subband study of wave- 
lets and fractals to the familiar positional number system dating back to the Arabs 
of ancient time: We all know that the expansion of a number in base $N$ amounts to
specifying a string of “digits”; in this case the “digits” are selected from the
possibilities $0, 1, \dots, N-1$. In the most elementary case, that of the natural numbers,
this is accomplished by a repeated application of Euclid’s algorithm; and we are here 
attempting to imitate the idea, now with a probabilistic twist, and in a varied array 
of applications, ranging from wavelets to fractals and from speech signals to image 
processing. 
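The digit expansion described above can be sketched in a few lines of Python. This is only an illustration: the function names `base_digits` and `from_digits` are ours, not the book's, and we list digits with the least significant one first, so that padding with zeros on the right yields an infinite string over $\{0, 1, \dots, N-1\}$.

```python
def base_digits(n, N):
    """Digits of the natural number n in base N, least significant first.

    Each step is a division with remainder: n = q*N + r with 0 <= r < N.
    """
    if N < 2 or n < 0:
        raise ValueError("need N >= 2 and n >= 0")
    digits = []
    while n > 0:
        n, r = divmod(n, N)
        digits.append(r)
    return digits or [0]


def from_digits(digits, N):
    """Inverse map: the sum of digits[k] * N**k."""
    return sum(d * N**k for k, d in enumerate(digits))
```

For example, `base_digits(11, 2)` lists the binary digits of 11 starting from the units digit, and `from_digits` recovers the number from them.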


2.1 Standing assumptions 


We now turn to the construction of the random walk and the transition probabilities
in the general context of a measure space $(X, \mathcal{B})$ and a fixed $N$-to-1 measurable
mapping $\sigma$.

Let $X$ be a set, and let $\mathcal{B}$ be a fixed $\sigma$-algebra of subsets of $X$. Let

\[ \sigma\colon X \to X \]

be an onto mapping which is assumed $\mathcal{B}$-measurable, i.e., for all $B \in \mathcal{B}$, the inverse
image

\[ \sigma^{-1}(B) := \{ x \in X \mid \sigma(x) \in B \} \]

is again in $\mathcal{B}$. Let $\mu$ be a probability measure defined on $\mathcal{B}$. We shall assume that the
singletons $\{x\}$, for $x \in X$, are in $\mathcal{B}$, but the measure $\mu$ need not be atomic. We shall
further assume that

\[ \text{for all } x \in X, \quad \#\sigma^{-1}(\{x\}) = N, \tag{2.1.1} \]

where $N \geq 2$ is fixed (and finite).
Let $W\colon X \to [0, \infty)$ be given, and assume that

(i) $W$ is measurable, i.e., that $W^{-1}(J) \in \mathcal{B}$ for all intervals $J$ in $[0, \infty)$,

and further that

(ii) $\sum_{y \in X,\ \sigma(y) = x} W(y) \leq 1$ for a.e. $x \in X$.

In view of (2.1.1), the sets $\sigma^{-1}(\{x\})$ may be labeled by

\[ \mathbb{Z}_N = \{0, 1, \dots, N-1\} \cong \mathbb{Z}/N\mathbb{Z}. \]

Since the singletons are in $\mathcal{B}$, we may pick measurable branches of the (set-theoretic)
inverse $\sigma^{-1}$, i.e., measurable maps $\tau_i\colon X \to X$, $i = 0, 1, \dots, N-1$, such that

\[ \sigma \circ \tau_i = \mathrm{id}_X, \quad i = 0, 1, \dots, N-1. \tag{2.1.2} \]

Using $W$, we may then define a probability of a transition, or walk, from $x$ to one of
the points $(\tau_i(x))_{0 \leq i < N}$ in $\sigma^{-1}(x)$, by

\[ P(x, \tau_i(x)) := W(\tau_i(x)). \tag{2.1.3} \]

Note that by assumption (ii),

\[ \sum_i P(x, \tau_i(x)) = \sum_{\sigma(y) = x} W(y) \leq 1. \]

The fact that we allow “$\leq$” rather than “$=$” in (ii) is a way of including the possibility
of dissipation in our random-walk model. For details, see Remark 2.8.3 below.
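As a concrete instance of these standing assumptions, the following Python sketch takes $X = [0, 1)$, $\sigma(x) = Nx \bmod 1$ with $N = 2$, the branches $\tau_i(x) = (x + i)/N$, and the weight $W(x) = \cos^2(\pi x)$. These particular choices are our assumptions for illustration; the text above works with a general $(X, \mathcal{B}, \sigma, W)$. For this $W$ the sum in (ii) equals 1, because $\cos^2(\pi x/2) + \cos^2(\pi (x+1)/2) = 1$.

```python
import math

N = 2  # dyadic case; an assumption made for this sketch


def sigma(x):
    """The N-to-1 map sigma(x) = N*x mod 1 on X = [0, 1)."""
    return (N * x) % 1.0


def tau(i, x):
    """Branch tau_i of the inverse of sigma, so that sigma(tau_i(x)) = x."""
    return (x + i) / N


def W(x):
    """Illustrative weight function W(x) = cos^2(pi x)."""
    return math.cos(math.pi * x) ** 2


def transition_probabilities(x):
    """The list of P(x, tau_i(x)) = W(tau_i(x)), i = 0, ..., N-1, as in (2.1.3)."""
    return [W(tau(i, x)) for i in range(N)]
```

Calling `transition_probabilities(0.3)` returns the two probabilities for the moves from $0.3$ to $0.15$ and to $0.65$.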

2.2 An example

Example 2.2.1. Farey trees. Let $X = \bigcup_{n=0}^{\infty} X_n$ where $X_n$ is the set of monotonically
increasing sequences of those continued fractions $[a_1, \dots, a_k]$
between 0 and 1 such that $a_i \in \mathbb{N}$ for $1 \leq i \leq k$, $a_k \geq 2$, and $\sum_{i=1}^{k} a_i = n + 2$. The
number of terms in $X_n$ is $2^n$, and two branches $\tau_0$ and $\tau_1$ are defined by

\[ \tau_0([\dots, a]) = [\dots, a - 1, 2], \qquad \tau_1([\dots, a]) = [\dots, a + 1]. \]

Note that each $\tau_i$ maps $X_n$ into $X_{n+1}$. If $x \in X_n$, we say that $\tau_0 x$ and $\tau_1 x$ are the
two daughters in the binary tree. As an example, consider Figure 2.1. These systems
define random walks, and are studied in connection with circle maps, dynamics, and
gap labeling in statistical mechanics; see [Hal83] and [CvSS85].
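The two branches of the Farey tree are easy to experiment with. The sketch below represents a continued fraction $[a_1, \dots, a_k]$ as a Python tuple and evaluates it as $1/(a_1 + 1/(a_2 + \cdots))$; the function names and the choice of root $X_0 = \{[2]\}$ are our reading of the example, stated here as assumptions.

```python
from fractions import Fraction


def value(cf):
    """Value of [a_1, ..., a_k] = 1/(a_1 + 1/(a_2 + ... + 1/a_k))."""
    x = Fraction(0)
    for a in reversed(cf):
        x = 1 / (a + x)
    return x


def tau0(cf):
    """[..., a] -> [..., a - 1, 2]"""
    return cf[:-1] + (cf[-1] - 1, 2)


def tau1(cf):
    """[..., a] -> [..., a + 1]"""
    return cf[:-1] + (cf[-1] + 1,)


def level(n):
    """The level X_n, generated from the root [2] by n applications of the branches."""
    current = [(2,)]
    for _ in range(n):
        current = [t(cf) for cf in current for t in (tau0, tau1)]
    return current
```

For instance, `[value(cf) for cf in level(2)]` lists the four fractions on the second level of the tree, and each entry of `level(n)` has partial quotients summing to $n + 2$.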

2.3 Some definitions: The Ruelle operator, harmonic functions,
cocycles 


The next definitions prepare us for the statements and proofs of Lemma 2.4.1, giving
the existence and uniqueness of the measures $P_x$, and Theorem 2.7.1, giving cocycles
as boundary values of bounded harmonic functions.

Definitions 2.3.1. (a) The Ruelle operator $R = R_W$ is defined by

\[ (Rf)(x) = \sum_{y \in X,\ \sigma(y) = x} W(y) f(y), \quad x \in X,\ f \in L^\infty(X), \tag{2.3.1} \]

and maps $L^\infty(X)$ into itself.
(b) Let $\Omega$ be the compact Cartesian product

\[ \Omega = \mathbb{Z}_N^{\mathbb{N}} = \{0, \dots, N-1\}^{\mathbb{N}} = \prod_{1}^{\infty} \{0, \dots, N-1\}. \tag{2.3.2} \]

(c) A bounded measurable function $V\colon X \times \Omega \to \mathbb{C}$ is said to be a cocycle if

\[ V(x, (\omega_1, \omega_2, \dots)) = V(\tau_{\omega_1}(x), (\omega_2, \omega_3, \dots)) \tag{2.3.3} \]

for all $\omega = (\omega_1, \omega_2, \dots) \in \Omega$.
(d) A function $h\colon X \to \mathbb{C}$ is said to be harmonic, or $R_W$-harmonic, if

\[ R_W h = h. \tag{2.3.4} \]

(e) Let $n \in \mathbb{N}$, and let $i_1, \dots, i_n \in \mathbb{Z}_N$. Then the subset

\[ A(i_1, \dots, i_n) = \{ \omega \in \Omega \mid \omega_1 = i_1, \dots, \omega_n = i_n \} \tag{2.3.5} \]

is called a cylinder set.
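The Ruelle operator of part (a) and the harmonicity condition of part (d) can be tried out numerically. The sketch below assumes the illustrative dyadic system $X = [0, 1)$, $\tau_i(x) = (x + i)/2$, $W(x) = \cos^2(\pi x)$; these choices and the helper names are ours. Since this $W$ satisfies $\sum_{\sigma(y)=x} W(y) = 1$, the constant function 1 is harmonic, while a generic function is not.

```python
import math

N = 2  # illustrative dyadic system; an assumption of this sketch


def tau(i, x):
    """Branch tau_i(x) = (x + i)/N of the inverse of sigma(x) = N*x mod 1."""
    return (x + i) / N


def W(x):
    """Illustrative weight with W(tau_0 x) + W(tau_1 x) = 1."""
    return math.cos(math.pi * x) ** 2


def ruelle(f):
    """Return R_W f, where (R_W f)(x) = sum_i W(tau_i x) f(tau_i x), cf. (2.3.1)."""
    def Rf(x):
        return sum(W(tau(i, x)) * f(tau(i, x)) for i in range(N))
    return Rf


def is_harmonic(h, points, tol=1e-12):
    """Check R_W h = h at finitely many sample points (a necessary condition only)."""
    Rh = ruelle(h)
    return all(abs(Rh(x) - h(x)) <= tol for x in points)
```

The check in `is_harmonic` samples finitely many points, so it can refute harmonicity but never prove it.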


2.4 Existence of the measures $P_x$

The cylinder sets generate the topology of $\Omega$ and its Borel $\sigma$-algebra. In determining
Radon measures on $\Omega$, it is therefore convenient to first specify them on cylinder
sets. This approach was initiated by Kolmogorov [Kol77]; see also Nelson [Nel69].
Recall that $\Omega$ is compact in the Tychonoff topology, and that we may use the
Stone-Weierstrass theorem on $C(\Omega)$ = the algebra of all continuous functions on $\Omega$.

Lemma 2.4.1. Let $X$, $\mathcal{B}$, $\mu$, $\sigma$, $\tau_0, \dots, \tau_{N-1}$, and $W$ be given as described above.
We make the following more restrictive assumption on $W$:

\[ \sum_{y \in X,\ \sigma(y) = x} W(y) = 1 \quad \text{a.e. } x \in X. \tag{2.4.1} \]

Then for every $x \in X$ there is a unique positive Radon probability measure $P_x$ on $\Omega$
such that

\[ P_x(A(i_1, \dots, i_n)) = W(\tau_{i_1} x)\, W(\tau_{i_2} \tau_{i_1} x) \cdots W(\tau_{i_n} \cdots \tau_{i_1} x). \tag{2.4.2} \]

Remark 2.4.2. It turns out that the general case $\sum_{\sigma(y)=x} W(y) \leq 1$ (assumption (ii) above)
may be reduced to (2.4.1). So (2.4.1) is not really a restriction. This is discussed in
Remark 2.8.3 below.


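Proof of Lemma 2.4.1. If $P$ is a Radon measure on $\Omega$, we set

\[ P[f] = \int_\Omega f(\omega)\, dP(\omega) \quad \text{for all } f \in C(\Omega). \tag{2.4.3} \]

Set

\[ C_{\mathrm{fin}}(\Omega) = \{ f \in C(\Omega) \mid \exists\, n \text{ such that } f(\omega) = f(\omega_1, \dots, \omega_n), \text{ i.e., } f \text{ depends only on the first } n \text{ coordinates in } \Omega \} = \bigcup_{n=1}^{\infty} \mathfrak{A}_n. \tag{2.4.4} \]

Note that (2.4.4) defines $C_{\mathrm{fin}}(\Omega)$, but may also be regarded as an implicit definition
of $\mathfrak{A}_n$, the functions depending only on the first $n$ coordinates. The algebra $C_{\mathrm{fin}}(\Omega)$ is
defined as a union over $n$, which as such is independent of $n$, and the argument is that
this union is dense in $C(\Omega)$. Considering $\mathfrak{A}_n$ in the definition (2.4.4) for increasing
values of the index $n$, we get an ascending nest of subalgebras of $C(\Omega)$,

\[ \mathfrak{A}_1 \subset \mathfrak{A}_2 \subset \cdots \subset \mathfrak{A}_n \subset \mathfrak{A}_{n+1} \subset \cdots. \tag{2.4.5} \]

An immediate application of Stone-Weierstrass shows that $C_{\mathrm{fin}}(\Omega)$ is uniformly
dense in $C(\Omega)$, i.e.,

\[ \overline{\bigcup_{n=1}^{\infty} \mathfrak{A}_n} = C(\Omega), \]

where the overline stands for norm-closure. Let $x \in X$, and $f \in C_{\mathrm{fin}}(\Omega)$. Suppose
$f(\omega) = f(\omega_1, \dots, \omega_n)$,
and set

\[ P_x[f] = \sum_{(\omega_1, \dots, \omega_n) \in \mathbb{Z}_N^n} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x)\, f(\omega_1, \dots, \omega_n). \tag{2.4.6} \]

Note that if there is some $h \in L^\infty(X)$ such that

\[ f(\omega_1, \dots, \omega_n) = h(\tau_{\omega_n} \cdots \tau_{\omega_1} x), \]

then

\[
\begin{aligned}
P_x[f] &= \sum_{(\omega_1, \dots, \omega_n)} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x)\, h(\tau_{\omega_n} \cdots \tau_{\omega_1} x) \\
&= \sum_{y \in X,\ \sigma^n y = x} W(\sigma^{n-1} y) \cdots W(\sigma y)\, W(y)\, h(y) \\
&= (R_W^n h)(x).
\end{aligned}
\tag{2.4.7}
\]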


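We now show that $P_x[f]$ is well defined. This is the Kolmogorov consistency:
we must check that the number $P_x[f]$ is the same when some $f \in \mathfrak{A}_n$ ($\subset \mathfrak{A}_{n+1}$) is
viewed also as an element in $\mathfrak{A}_{n+1}$. Write $f_n$ and $f_{n+1}$ for $f$ viewed in $\mathfrak{A}_n$ and in
$\mathfrak{A}_{n+1}$, respectively. Then

\[ f(\omega) = f(\omega_1, \dots, \omega_n) = f(\omega_1, \dots, \omega_n, \omega_{n+1}), \]

and

\[
\begin{aligned}
P_x[f_{n+1}] &= \sum_{\omega_1, \dots, \omega_{n+1}} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_{n+1}} \cdots \tau_{\omega_1} x)\, f(\omega_1, \dots, \omega_{n+1}) \\
&= \sum_{\omega_1, \dots, \omega_n} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x)
   \underbrace{\Bigl( \sum_{\omega_{n+1}} W(\tau_{\omega_{n+1}} \tau_{\omega_n} \cdots \tau_{\omega_1} x) \Bigr)}_{=1 \text{ by (2.4.1)}} f(\omega_1, \dots, \omega_n) \\
&= \sum_{\omega_1, \dots, \omega_n} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x)\, f(\omega_1, \dots, \omega_n) \\
&= P_x[f_n],
\end{aligned}
\]

as claimed.

The consistency conditions may be stated differently in terms of conditional
probabilities: for $f \in C(\Omega)$, set

\[
P_x^{(n)}[f] = P_x[f \mid \mathfrak{A}_n]
= \sum_{(\omega_1, \dots, \omega_n)} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x)\, f(\omega_1, \dots, \omega_n).
\tag{2.4.8}
\]

We proved that

\[ P_x^{(n)}[f] = P_x^{(n+1)}[f] \quad \text{for all } f \in \mathfrak{A}_n. \]

Using now the theorems of Stone-Weierstrass and Riesz, we get the existence of
the measure $P_x$ on $\Omega$. It is clear that it has the desired properties. In particular, the
property (2.4.2) results from applying (2.4.6) to the function

\[ f(\omega) := \delta_{i_1, \omega_1} \cdots \delta_{i_n, \omega_n}, \quad \omega \in \Omega, \tag{2.4.9} \]

when the point $(i_1, \dots, i_n)$ is fixed.
These functions, in turn, span a dense subalgebra in $C(\Omega)$ (by Stone-Weierstrass),
so $P_x$ is determined uniquely by (2.4.2). $\square$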
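The product formula (2.4.2) and the consistency step of the proof can be checked numerically on a small example. The sketch below again assumes the illustrative dyadic system $\tau_i(x) = (x + i)/2$ on $[0, 1)$ with $W(x) = \cos^2(\pi x)$, which satisfies (2.4.1); the function name `cylinder_measure` is ours.

```python
import math
from itertools import product

N = 2  # illustrative dyadic system; an assumption of this sketch


def tau(i, x):
    """Branch tau_i(x) = (x + i)/N."""
    return (x + i) / N


def W(x):
    """Illustrative weight satisfying the normalization (2.4.1)."""
    return math.cos(math.pi * x) ** 2


def cylinder_measure(x, word):
    """P_x(A(i_1, ..., i_n)) computed as the product in (2.4.2)."""
    p = 1.0
    y = x
    for i in word:
        y = tau(i, y)  # after k steps, y = tau_{i_k} ... tau_{i_1} x
        p *= W(y)
    return p


def total_mass(x, n):
    """Sum of P_x over all N**n cylinder sets of length n; equals 1 under (2.4.1)."""
    return sum(cylinder_measure(x, w) for w in product(range(N), repeat=n))
```

Kolmogorov consistency shows up as the identity `cylinder_measure(x, w) == sum(cylinder_measure(x, w + (i,)) for i in range(N))`, up to rounding error.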


2.5 Kolmogorov’s consistency condition 


For general reference, we now make explicit the extension principle of Kolmogorov 
[Kol77] in its function-theoretic form. 


Lemma 2.5.1. (Kolmogorov) Let $N \geq 2$ be fixed, and let

\[ \Omega = \{0, 1, \dots, N-1\}^{\mathbb{N}}. \]

For $n = 1, 2, \dots$, let

\[ P^{(n)}\colon \mathfrak{A}_n \to \mathbb{C} \]

be a sequence of linear functionals such that (i)-(iii) hold:

(i) $P^{(n)}[\mathbb{1}] = 1$, where $\mathbb{1}$ denotes the constant function 1 on $\Omega$,
(ii) $f \in \mathfrak{A}_n$, $f \geq 0$ pointwise $\Longrightarrow$ $P^{(n)}[f] \geq 0$,

and

(iii) $P^{(n)}[f] = P^{(n+1)}[f]$ for all $f \in \mathfrak{A}_n$.

Then there is a unique Borel probability measure $P$ on $\Omega$ such that

\[ P[f] = P^{(n)}[f], \quad f \in \mathfrak{A}_n. \tag{2.5.1} \]

Specifically, for $P$, we have the implication

\[ f \in C(\Omega),\ f \geq 0 \text{ pointwise} \Longrightarrow P[f] \geq 0. \tag{2.5.2} \]

Remark 2.5.2. Here we have identified positive linear functionals $P$ on $C(\Omega)$ with
the corresponding Radon measures $P$ on $\Omega$, i.e.,

\[ P[f] = \int_\Omega f\, dP. \tag{2.5.3} \]

This identification $P \leftrightarrow P$ is based on an implicit application of Riesz’s theorem;
see [Rud87, Chapter 1].

Proof of Lemma 2.5.1. The proof of Kolmogorov’s extension result may be given
several forms, but we note that the argument we used above (in a special case), based
on an application of the Stone-Weierstrass theorem, also works in general. $\square$


2.6 The probability space $\Omega$


We note that the probability space $\Omega$ itself carries mappings $\sigma$ and $\tau_i$, $i = 0, 1, \dots,
N-1$, satisfying the conditions (2.1.2): stressing the $\Omega$-dependence, we write

\[ \sigma^\Omega(\omega) = (\omega_2, \omega_3, \dots) \quad \text{and} \quad \tau_i^\Omega(\omega) = (i, \omega_1, \omega_2, \dots) \tag{2.6.1} \]

for $\omega = (\omega_1, \omega_2, \dots) \in \Omega$.

Remark 2.6.1. The connection between the cylinder sets in (2.3.5) and the iterated
function systems (IFS) $(X, \sigma, \tau_0, \dots, \tau_{N-1})$ may be spelled out as follows: the cylinder
sets in $\Omega$ generate the $\sigma$-algebra of measurable subsets of $\Omega$, and similarly the
subsets $\tau_{i_1} \cdots \tau_{i_n}(X) \subset X$ generate a $\sigma$-algebra of measurable subsets of $X$. When
nothing further is specified, these will be the $\sigma$-algebras which we refer to when
discussing the measurable functions on $\Omega$ and $X$. In particular, we will denote by
$M(\Omega)$ and $M(X)$ the respective algebras of all bounded measurable functions on
$\Omega$, respectively $X$.
Note that if $X = [0, 1]$ and

\[ \tau_i(x) = \frac{x + i}{N}, \quad 0 \leq i \leq N-1, \]

then we recover the familiar $N$-adic subintervals:

\[ \tau_{i_1} \cdots \tau_{i_n}(X) = \Bigl[ \frac{i_1}{N} + \cdots + \frac{i_n}{N^n},\ \frac{i_1}{N} + \cdots + \frac{i_n}{N^n} + \frac{1}{N^n} \Bigr]. \tag{2.6.2} \]

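Lemma 2.6.2. There is a unique mapping $\rho\colon M(\Omega) \to M(X)$ which satisfies

\[ \rho(fg) = \rho(f)\, \rho(g) \tag{2.6.3} \]

and

\[ \rho(\chi_{A(i_1, \dots, i_n)}) = \chi_{\tau_{i_1} \cdots \tau_{i_n}(X)}. \tag{2.6.4} \]

The mapping $\rho$ is an isomorphism of $M(\Omega)$ onto $M(X)$.
Proof. Recalling (2.4.9), we note that

\[ \chi_{A(i_1, \dots, i_n)}(\omega) = \delta_{i_1, \omega_1} \cdots \delta_{i_n, \omega_n}, \quad \omega \in \Omega. \]

As a result,

\[ \chi_{A(i_1, \dots, i_n)}\, \chi_{A(j_1, \dots, j_n)} = \delta_{i_1, j_1} \cdots \delta_{i_n, j_n}\, \chi_{A(i_1, \dots, i_n)}. \tag{2.6.5} \]

We then define $\rho$ first on $\mathfrak{A}_n$ by

\[ \rho\Bigl( \sum_{i_1, \dots, i_n} a_{i_1, \dots, i_n}\, \chi_{A(i_1, \dots, i_n)} \Bigr) = \sum_{i_1, \dots, i_n} a_{i_1, \dots, i_n}\, \chi_{\tau_{i_1} \cdots \tau_{i_n}(X)}, \]

where $a_{i_1, \dots, i_n} \in \mathbb{C}$, and note that (2.6.3) is satisfied.

It is easy to check that the extension of $\rho$ from $\mathfrak{A}_n$ to $\mathfrak{A}_{n+1}$ is consistent. The
final extension from $\bigcup_n \mathfrak{A}_n$ to $M(\Omega)$ is done by Kolmogorov’s lemma, and it can be
checked that $\rho$ has the properties stated in the conclusion of the present lemma. $\square$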


Lemma 2.6.3. Let $X$, $W$, and $N$ be as described in the beginning of this chapter, and
let $\{ P_x \mid x \in X \}$ be the process obtained in the conclusion of Lemma 2.4.1. Then

\[ \sum_{i=0}^{N-1} W(\tau_i x)\, P_{\tau_i x}[f(i, \cdot)] = P_x[f] \quad \text{for all } f \in C(\Omega). \tag{2.6.6} \]

Remark 2.6.4. Stated informally, formula (2.6.6) is an assertion about the random
walk: it says that if the walk starts at $x$, then with probability one, it makes a transition
to one of the $N$ points $\tau_0(x), \dots, \tau_{N-1}(x)$. The probability of the move $x \mapsto \tau_i x$ is
$W(\tau_i x)$. Recall (2.4.1) asserts that $\sum_i W(\tau_i x) = 1$.
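The informal description of the walk in Remark 2.6.4 translates directly into a sampler: from the current point $y$, move to $\tau_i(y)$ with probability $W(\tau_i(y))$ and record the index $i$. The sketch below does this for the illustrative dyadic system $\tau_i(x) = (x + i)/2$, $W(x) = \cos^2(\pi x)$ (our assumption, chosen so that (2.4.1) holds); empirical frequencies of initial words should then approximate the cylinder measures in (2.4.2).

```python
import math
import random

N = 2  # illustrative dyadic system; an assumption of this sketch


def tau(i, x):
    """Branch tau_i(x) = (x + i)/N."""
    return (x + i) / N


def W(x):
    """Illustrative weight satisfying the normalization (2.4.1)."""
    return math.cos(math.pi * x) ** 2


def sample_path(x, n, rng):
    """Sample (omega_1, ..., omega_n): from y, step to tau_i(y) with probability W(tau_i(y))."""
    word, y = [], x
    for _ in range(n):
        weights = [W(tau(i, y)) for i in range(N)]
        i = rng.choices(range(N), weights=weights)[0]
        y = tau(i, y)
        word.append(i)
    return tuple(word)
```

A seeded generator such as `random.Random(0)` makes runs reproducible; the agreement with (2.4.2) is only statistical, so any comparison needs a tolerance.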


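Proof of Lemma 2.6.3. It follows from (2.4.6) and the arguments in the proof of
Lemma 2.4.1 that it is enough to verify (2.6.6) for $f \in C_{\mathrm{fin}}(\Omega)$, i.e., for $f \in \mathfrak{A}_{n+1}$
for some $n$. Let $f \in \mathfrak{A}_{n+1}$. Then

\[
\begin{aligned}
\sum_i W(\tau_i x)\, P_{\tau_i x}[f(i, \cdot)]
&= \sum_i \sum_{\omega_1, \dots, \omega_n} W(\tau_i x)\, W(\tau_{\omega_1} \tau_i x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} \tau_i x)\, f(i, \omega_1, \dots, \omega_n) \\
&= P_x[f] \quad \text{(by (2.4.6))},
\end{aligned}
\]

which is the desired conclusion. Recall $C_{\mathrm{fin}}(\Omega)$ is norm-dense in $C(\Omega)$. $\square$

Remark 2.6.5. Note that the formula (2.6.6) generalizes the familiar notion of
self-similarity for measures introduced by Hutchinson in [Hut81]; see also (4.1.4) below.
In fact, (2.6.6) may be restated as

\[ \sum_{i=0}^{N-1} W(\tau_i x)\, P_{\tau_i x} \circ (\tau_i^\Omega)^{-1} = P_x. \tag{2.6.7} \]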
i=0 


2.7 A boundary representation for harmonic functions 


The next result, Theorem 2.7.1, is a variant of the classical Fatou-Privalov theorem.
But in the present context, it has cocycles as boundary values of certain bounded 
harmonic functions—the notion “harmonic” is here defined from the Ruelle operator 
R of Definition 2.3.1(a). We obtain our cocycles as an application of the familiar 
convergence theorem for bounded martingales. 

We then resume our study of harmonic functions in Chapter 6 below. 
Theorem 2.7.1. Let $X$, $W$, and $N$ be as described in the beginning of this chapter,
and suppose in addition that (2.4.1) is satisfied. Let $R = R_W$ be the Ruelle operator
on $L^\infty(X)$, and let $\{ P_x \mid x \in X \}$ be the process on $\Omega$ from Lemma 2.4.1. Then there
is a 1-1 correspondence between the bounded harmonic functions $h$ and the cocycles
$V$ as follows: If

\[ V\colon X \times \Omega \to \mathbb{C} \]

is a cocycle (bounded and measurable), then

\[ h(x) = h_V(x) = P_x[V(x, \cdot)] \tag{2.7.1} \]

is harmonic, i.e.,

\[ Rh = h. \tag{2.7.2} \]

Conversely, $V$ may be recovered from $h$ as a martingale limit.

Proof. (2.7.1) $\Rightarrow$ (2.7.2). This follows by Lemma 2.6.3 and the cocycle property.
Let $V$ be a cocycle, and define $h = h_V$ by (2.7.1). Then for $x \in X$,

\[
\begin{aligned}
(Rh)(x) &= \sum_i W(\tau_i x)\, h(\tau_i x) = \sum_i W(\tau_i x)\, P_{\tau_i x}[V(\tau_i x, \cdot)] \\
&= \sum_i W(\tau_i x)\, P_{\tau_i x}[V(x, (i, \cdot))] && \text{(the cocycle property)} \\
&= P_x[V(x, \cdot)] && \text{(by (2.6.6))} \\
&= h(x) && \text{(by (2.7.1))},
\end{aligned}
\]

where the summations are over $i \in \mathbb{Z}_N$. This proves (2.7.2).
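The converse, (2.7.2) $\Rightarrow$ (2.7.1). Let $h$ be a given harmonic function on $X$. Let
$n \in \mathbb{N}$, and define

\[ L_x[f] = \int_\Omega f(\omega)\, h(\tau_{\omega_n} \cdots \tau_{\omega_1} x)\, dP_x(\omega), \quad f \in \mathfrak{A}_n. \tag{2.7.3} \]

An application of (2.7.2) and the argument above show that $L_x$ is consistent, i.e.,
using $f(\omega) = f(\omega_1, \dots, \omega_n) = f(\omega_1, \dots, \omega_n, \omega_{n+1})$, we get

\[
\begin{aligned}
L_x^{(n)}[f] &= \int_\Omega f(\omega)\, h(\tau_{\omega_n} \cdots \tau_{\omega_1} x)\, dP_x(\omega) \\
&= \int_\Omega f(\omega)\, (Rh)(\tau_{\omega_n} \cdots \tau_{\omega_1} x)\, dP_x(\omega) \\
&= \int_\Omega f(\omega)\, h(\tau_{\omega_{n+1}} \tau_{\omega_n} \cdots \tau_{\omega_1} x)\, dP_x(\omega) = L_x^{(n+1)}[f],
\end{aligned}
\]

which shows that Kolmogorov’s consistency condition holds.
Hence there is a Radon measure on $\Omega$, also denoted $L_x$, such that

\[ L_x(f) = \int_\Omega f\, dL_x, \quad f \in C(\Omega). \tag{2.7.4} \]

Absolute continuity is immediate, i.e.,

\[ dL_x \ll dP_x. \]

Let $V(x, \cdot)$ be the Radon-Nikodym derivative, i.e.,

\[ V(x, \cdot) = \frac{dL_x}{dP_x} \quad \text{on } \Omega. \tag{2.7.5} \]

It then follows from (2.7.3) and (2.7.4) that $V$ is a cocycle, i.e., that it satisfies the
conditions in Definition 2.3.1(c). $\square$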


Remark 2.7.2. It is useful to think of Theorem 2.7.1 as an analogue of the classical
Fatou-Privalov theorem [Rud87, Chapter 11] about the existence of boundary
functions for bounded harmonic functions. Recall that every bounded harmonic function
$h$ in the disk $X = \{ x \in \mathbb{C} \mid |x| < 1 \}$ is the Poisson integral of a function $v$ on the
boundary of $X$, viz., the circle $S = \{ x \in \mathbb{C} \mid |x| = 1 \}$. Specifically, if $P_x$ for $x \in X$
denotes the Poisson measure, then the analogue of (2.7.1) is

\[ h(x) = P_x[v], \]

or

\[ h(x) = \int_S v(\omega)\, dP_x(\omega), \]

or more explicitly in polar coordinates,

\[ h(re^{i\theta}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} v(e^{it})\, \frac{1 - r^2}{1 - 2r\cos(\theta - t) + r^2}\, dt, \]

where $v(e^{i\theta}) = \lim_{r \to 1} h(re^{i\theta})$ for a.e. $\theta$.

Moreover, see [Rud87, Theorem 11.22], $v(\omega)$ is the non-tangential limit of the
given harmonic function, for points $x$ tending non-tangentially to $\omega$.

Hence in the present analogy, our probability space $\Omega$ is the analogue of the
boundary, and our cocycle $V$ is the analogue of the limit function $v$.

Using the notion of martingales, see [Wil91], we may recover $V$ from $h$ in the
following way. Let

\[ X_i(\omega) = \omega_i \quad \text{for } \omega \in \Omega, \]

and set

\[ Z_n(x, \omega) = \tau_{X_n(\omega)} \cdots \tau_{X_1(\omega)} x = \tau_{\omega_n} \cdots \tau_{\omega_1} x, \]

and note that $Z_n$ is a Markov chain with transition probability $P_x$ at $x$. If $h$ satisfies
(2.7.2), i.e., is $R$-harmonic, then $h(Z_n(x, \cdot))$, $n = 1, 2, \dots$, is a bounded martingale,
and so it converges pointwise $P_x$-a.e. on $\Omega$, i.e., the limit

\[ V(x, \cdot) := \lim_{n \to \infty} h(Z_n(x, \cdot)) \quad \text{on } \Omega \tag{2.7.6} \]

is well defined. It is immediate from (2.4.7) that this $V$ is a cocycle, and that it
satisfies (2.7.1).

A good reference on the standard facts about martingales, including the martingale
convergence theorem, is [RoWi00, II.48, pp. 146-148]. The martingale property
for $h(Z_n(x, \omega))$ amounts to the identity

\[ P_x( h(Z_n(x, \cdot)) \mid \mathfrak{A}_{n-1} ) = h(Z_{n-1}(x, \cdot)). \tag{2.7.7} \]

This is the defining property of a martingale, and to verify it we proceed as follows.
Using the identity $R_W h = h$, we get

\[
\begin{aligned}
P_x( h(Z_n(x, \cdot)) \mid \mathfrak{A}_{n-1} )
&= \sum_{\omega_n} W(\tau_{\omega_n} \tau_{\omega_{n-1}} \cdots \tau_{\omega_1} x)\, h(\tau_{\omega_n} \tau_{\omega_{n-1}} \cdots \tau_{\omega_1} x) \\
&= (R_W h)(\tau_{\omega_{n-1}} \cdots \tau_{\omega_1} x) \\
&= h(\tau_{\omega_{n-1}} \cdots \tau_{\omega_1} x) = h(Z_{n-1}(x, \cdot)),
\end{aligned}
\]

which is the desired identity (2.7.7).


Corollary 2.7.3. The 1-1 correspondence between the harmonic functions $h$ in
Theorem 2.7.1 and the cocycles $V$ is an order-isomorphism when the ordering of both
families of functions is defined in the pointwise sense.

Proof. We noted that when $V$ is a given cocycle, then the corresponding function $h$
is defined by (2.7.1); and conversely, $V$ may be computed from $h$ via the formula
(2.7.6).

Since each $P_x$ is a positive measure, (2.7.1) yields the implication

\[ V_1 \leq V_2 \Longrightarrow h_{V_1} \leq h_{V_2}. \]

Conversely, if $h_1$, $h_2$ are given harmonic functions and $V_1$, $V_2$ are the corresponding
cocycles, then

\[ h_1 \leq h_2 \Longrightarrow V_1 \leq V_2. \]

This second implication is immediate from (2.7.6). $\square$


Corollary 2.7.4. Let $v\colon \Omega \to \mathbb{C}$ be a bounded measurable function, and let $X$, $W$,
and $R = R_W$ be as stated in the theorem. Then

\[ V(x, \omega) := v(\omega), \quad \omega \in \Omega, \tag{2.7.8} \]

is a cocycle if and only if $v$ is invariant, i.e.,

\[ v \circ \sigma^\Omega = v. \tag{2.7.9} \]

In particular, if $v$ is invariant under the one-sided shift on $\Omega$, then

\[ h(x) = P_x[v] \tag{2.7.10} \]

is harmonic.

Proof. Let $v\colon \Omega \to \mathbb{C}$ be given, and suppose $V$ in (2.7.8) is a cocycle. Then

\[ v(\omega) = V(x, \omega) = V(\tau_{\omega_1} x, (\omega_2, \omega_3, \dots)) = v(\omega_2, \omega_3, \dots) = v(\sigma^\Omega(\omega)), \]

which is the desired conclusion (2.7.9). The converse implication is immediate.
If $h$ is defined by (2.7.10) for some shift-invariant $v$, then it follows from
Theorem 2.7.1 that $h$ is harmonic; see the proof of (2.7.1) $\Rightarrow$ (2.7.2). $\square$


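Definition 2.7.5. We say that a subset $E \subset \Omega$ is shift-invariant if $(\sigma^\Omega)^{-1} E = E$.

In view of (2.6.1),

\[ (\sigma^\Omega)^{-1} E = \bigcup_{i=0}^{N-1} \tau_i^\Omega(E), \]

so shift-invariant sets $E$ satisfy $\mathbb{Z}_N \times E = E$. For example,

\[ E := \bigcup_n \bigl( \underbrace{\mathbb{Z}_N \times \cdots \times \mathbb{Z}_N}_{n \text{ times}} \times \{0\} \bigr) \]

is shift-invariant. Recall that there is a canonical bijection, see (1.3.8), between this
set $E$ and the natural numbers $\mathbb{N}_0 = \{0, 1, \dots\}$.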


Corollary 2.7.6. Let $E \subset \Omega$, and let $X$, $\sigma$, $\tau_i$, $N$, and $W$ be as described in the
beginning of the chapter. Set

\[ h_E(x) := P_x[\chi_E], \tag{2.7.11} \]

where $\chi_E$ denotes the indicator function of $E$.
Then $h_E$ is harmonic, i.e., satisfies $R_W h_E = h_E$, if and only if $E$ is shift-invariant.

Proof. This follows directly from Corollary 2.7.4 above, if we note that

\[ \chi_E \circ \sigma^\Omega = \chi_{(\sigma^\Omega)^{-1} E}. \qquad \square \]

2.8 Invariant measures 


Definition 2.8.1. One of the uses of the functions $h$ from (2.7.2), i.e., the harmonic
functions, is that they determine invariant measures $\nu$ on $X$, i.e., measures on $X$
invariant under the endomorphism $\sigma\colon X \to X$. We say that $\nu$ is $\sigma$-invariant, or
simply invariant, if $\nu \circ \sigma^{-1} = \nu$, or

\[ \nu(\sigma^{-1}(B)) = \nu(B), \quad B \in \mathcal{B}. \tag{2.8.1} \]

We say that $\nu$ is $R$-invariant if

\[ \int_X Rf\, d\nu = \int_X f\, d\nu \tag{2.8.2} \]

for all bounded measurable functions $f$ on $X$.
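Proposition 2.8.2. Let $\nu$ be $R$-invariant, and let $h$ satisfy $Rh = h$. Then the measure

\[ d\nu_h := h\, d\nu \tag{2.8.3} \]

is invariant.
Conversely, if $\nu_1$ is a $\mathcal{B}$-measure on $X$ which is assumed $\sigma$-invariant, and if
$\nu_1 \ll \nu$, then the Radon-Nikodym derivative $h = \dfrac{d\nu_1}{d\nu}$ satisfies

\[ Rh = h \quad \nu\text{-a.e. on } X. \tag{2.8.4} \]

Proof. (Part one!)

\[
\begin{aligned}
\int f \circ \sigma\, d\nu_h &= \int (f \circ \sigma)\, h\, d\nu \\
&= \int R\bigl( (f \circ \sigma)\, h \bigr)\, d\nu = \int f\, Rh\, d\nu \\
&= \int f h\, d\nu = \int f\, d\nu_h,
\end{aligned}
\]

which implies that $\nu_h$ is $\sigma$-invariant.
To prove (2.8.4) under the assumptions in the second part of the proposition, let
$f$ be a bounded $\mathcal{B}$-measurable function on $X$. Then

\[
\begin{aligned}
\int f\, Rh\, d\nu &= \int R\bigl( (f \circ \sigma)\, h \bigr)\, d\nu = \int (f \circ \sigma)\, h\, d\nu \\
&= \int (f \circ \sigma)\, d\nu_1 = \int f\, d\nu_1 = \int f h\, d\nu,
\end{aligned}
\]

where all integrals are over $X$. As $f$ is arbitrary, the conclusion (2.8.4) follows. $\square$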
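Both halves of the proof rest on the pull-out identity $R\bigl((f \circ \sigma)\, h\bigr) = f \cdot Rh$, which holds because $\sigma(y) = x$ for every $y$ in the sum defining $R$ at $x$. The sketch below checks it numerically for the illustrative dyadic system $\sigma(x) = 2x \bmod 1$, $\tau_i(x) = (x + i)/2$, $W(x) = \cos^2(\pi x)$; these choices and the helper name `pull_out_defect` are assumptions made for illustration.

```python
import math

N = 2  # illustrative dyadic system; an assumption of this sketch


def sigma(x):
    """sigma(x) = N*x mod 1 on [0, 1)."""
    return (N * x) % 1.0


def tau(i, x):
    """Branch tau_i(x) = (x + i)/N, so sigma(tau_i(x)) = x."""
    return (x + i) / N


def W(x):
    """Illustrative weight function."""
    return math.cos(math.pi * x) ** 2


def ruelle(f):
    """Return R_W f, where (R_W f)(x) = sum_i W(tau_i x) f(tau_i x)."""
    return lambda x: sum(W(tau(i, x)) * f(tau(i, x)) for i in range(N))


def pull_out_defect(f, h, x):
    """|R((f o sigma) h)(x) - f(x) (R h)(x)|; zero up to rounding error."""
    lhs = ruelle(lambda y: f(sigma(y)) * h(y))(x)
    rhs = f(x) * ruelle(h)(x)
    return abs(lhs - rhs)
```

Any bounded `f` and `h` may be plugged in; the identity does not use harmonicity of `h` or any normalization of `W`.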


Remark 2.8.3. Suppose there is some probability measure $\nu$ on $X$ satisfying

\[ \nu R_W = \nu. \tag{2.8.5} \]

Then the assumption (ii),

\[ \sum_{\sigma(y) = x} W(y) \leq 1, \tag{2.8.6} \]

may be reduced to the special normalization (2.4.1) by the following argument.
Assuming (2.8.6), then the sequence

\[ h_n(x) := \sum_{\sigma^n(y) = x} W(\sigma^{n-1}(y)) \cdots W(\sigma(y))\, W(y) \tag{2.8.7} \]

is monotone decreasing. If a probability measure $\nu$ exists satisfying (2.8.5), then

\[ \nu(h_n) = \int_X h_n\, d\nu = 1 \quad \text{for all } n, \]

and the limit function

\[ h(x) := \inf_n h_n(x) = \lim_{n \to \infty} h_n(x) \]

is measurable, and satisfies

\[ \nu(h) = 1 \quad \text{and} \quad R_W h = h. \tag{2.8.8} \]

Since $W(x)\, h(x) \leq h(\sigma(x))$, it follows that the modified $W$-function

\[ W_h(x) := \frac{W(x)\, h(x)}{h(\sigma(x))}, \quad x \in X, \tag{2.8.9} \]

is well defined, and satisfies the special normalization rule (2.4.1), i.e.,

\[ \sum_{y \in X,\ \sigma(y) = x} W_h(y) = 1 \quad \text{a.e. } x \in X. \]

To see this, recall that

\[
\sum_{\sigma(y) = x} W_h(y) = \sum_{\sigma(y) = x} \frac{W(y)\, h(y)}{h(\sigma(y))}
= \frac{1}{h(x)} \sum_{\sigma(y) = x} W(y)\, h(y)
= \frac{(R_W h)(x)}{h(x)} = 1,
\]

which is the desired identity.
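The final computation can be illustrated numerically. In the sketch below we do not construct $h$ as the limit of the $h_n$; instead we take a positive function $g$, build a weight $W$ for which $g$ is harmonic by construction, and check that $W_h$ with $h = g$ satisfies (2.4.1). The dyadic system, the functions `g` and `W0`, and all names are assumptions for illustration, and we do not verify (2.8.6) for this $W$.

```python
import math

N = 2  # illustrative dyadic system; an assumption of this sketch


def sigma(x):
    """sigma(x) = N*x mod 1 on [0, 1)."""
    return (N * x) % 1.0


def tau(i, x):
    """Branch tau_i(x) = (x + i)/N."""
    return (x + i) / N


def g(x):
    """A positive function, bounded away from zero; plays the role of h."""
    return 2.0 + math.cos(2 * math.pi * x)


def W0(x):
    """A weight that already satisfies the normalization (2.4.1)."""
    return math.cos(math.pi * x) ** 2


def W(x):
    """Weight built so that R_W g = g holds by construction."""
    return W0(x) * g(sigma(x)) / g(x)


def W_h(x, h=g):
    """The modified weight W_h(x) = W(x) h(x) / h(sigma(x)) of (2.8.9)."""
    return W(x) * h(x) / h(sigma(x))


def branch_sum(weight, x):
    """Sum of weight(y) over the N points y with sigma(y) = x."""
    return sum(weight(tau(i, x)) for i in range(N))
```

Here `branch_sum(W, x)` need not equal 1, while `branch_sum(W_h, x)` does, up to rounding.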


Exercises 


2.1. Verify the details in the argument in the proof of Lemma 2.4.1 for why $\bigcup_n \mathfrak{A}_n$
is an algebra of continuous functions, and for why it is dense in $C(\Omega)$.

2.2. Give a direct and geometric argument for the identity (2.6.7) in Remark 2.6.5.

2.3. Give three different ways to see that a cocycle $V = V_h$ may be obtained from
every solution $h$ as specified by (2.7.2) in Theorem 2.7.1.

2.4. Can the assumption in Theorem 2.7.1 of boundedness on the harmonic function
$h$ be omitted?

2.5. Give a version of Theorem 2.7.1 which holds when $h$ is not assumed bounded.

2.6. Compact operators

Let $I := [0, 1]$, and let $R\colon I \times I \to \mathbb{C}$ be a continuous function. Suppose that

\[ \sum_i \sum_j \bar{\xi}_i\, R(x_i, x_j)\, \xi_j \geq 0 \tag{2E.1} \]

holds for all finite sequences $(\xi_i)$ and all point configurations $x_1, x_2, \dots$ in $I$.
(a) Then show that there is a monotone sequence $\lambda_1, \lambda_2, \dots$, $0 \leq \lambda_{n+1} \leq \lambda_n \leq
\cdots \leq \lambda_1$, such that $\lambda_n \to 0$; and an ONB $(g_n)$ in $L^2(I)$, with Lebesgue measure,
satisfying

\[ \int_0^1 R(x, y)\, g_n(y)\, dy = \lambda_n g_n(x). \]

(b) The spectral theorem: Show that the operator $T_R$, defined as

\[ (T_R f)(x) = \int_0^1 R(x, y)\, f(y)\, dy \]

in $L^2(I)$, satisfies

\[ T_R = \sum_{n=1}^{\infty} \lambda_n\, |g_n\rangle \langle g_n|, \]

where $|g_n\rangle \langle g_n|$ is Dirac notation for the rank-one projection onto $\mathbb{C} g_n$. (We say that
$T_R$ is a positive (or non-negative) compact operator.)

2.7. Karhunen-Loève [Ash90]

Let $(\Omega, \mathcal{B}, \nu)$ be a probability space, i.e., with $\nu(\Omega) = 1$; and set

\[ E(Y) = \int_\Omega Y(\omega)\, d\nu(\omega) \]

for random variables $Y$ on $\Omega$.
Let $X\colon I \to L^2(\Omega, \nu)$ be a random process with $X(x) := X(x, \cdot)$ varying
continuously in $L^2(\Omega)$. Suppose the following two conditions hold:

(i) $E(X(x)) = 0$, for all $x \in I$, and
(ii) $R(x, y) := E(\overline{X(x)}\, X(y))$ is continuous on $I \times I$.

(a) Then show that condition (2E.1) is satisfied, and let $(g_n)$ be an associated
ONB in $L^2(I)$.
(b) Tensor product: Show that there is a sequence $(Y_n)$ such that

(i) $Y_n \in L^2(\Omega, \nu)$,
(ii) $X(x, \omega) = \sum_{n=1}^{\infty} g_n(x)\, Y_n(\omega)$ is convergent in $L^2(I) \otimes L^2(\Omega, \nu)$, and
(iii) $E(\overline{Y_k}\, Y_n) = \delta_{k,n} \lambda_k$.

Hint: Set

\[ Y_k = \int_0^1 \overline{g_k(x)}\, X(x, \cdot)\, dx. \]

2.8. Brownian motion

Notation as in Exercises 2.6 and 2.7: $I = [0, 1]$, and $(\Omega, \mathcal{B}, \nu)$ is a fixed
probability space.

Suppose $X\colon I \to L^2(\Omega)$ is given, and suppose also that it is Gaussian, that is,
that the family $\{ X(x) \mid x \in I \}$ of random variables has joint Gaussian distributions.

(a) Under the stated condition, prove that the random variables $\{ Y_n \mid n \in \mathbb{N} \}$ in
Exercise 2.7(b) are automatically independent Gaussian.

(b) Suppose in addition to the above condition on $\{ X(x) \mid x \in I \}$ that $R(x, y) =
\min(x, y)$ for $(x, y) \in I \times I$. Then show that $(X(x))$ is Brownian motion and has
the following representation:

\[ X(x, \omega) = \sqrt{2} \sum_{n=1}^{\infty} \frac{\sin\bigl( (n - \tfrac{1}{2}) \pi x \bigr)}{(n - \tfrac{1}{2}) \pi}\, Z_n(\omega) \quad \text{for } x \in I,\ \omega \in \Omega, \tag{2E.2} \]

where $\{ Z_n \mid n \in \mathbb{N} \}$ is an orthonormal family of random variables in $L^2(\Omega)$. In
particular, $E(Z_n) = 0$ and $E(Z_n^2) = 1$ holds for all $n \in \mathbb{N}$.


(c) With the same assumptions as in (b) above, prove that the expansion (2E.2)
for a.e. $\omega \in \Omega$ in fact converges uniformly for $x \in I$.


-1 

(d) Show that 577° | ((n - ) z) |Zn (-)| converges with probability 1. 
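As a numerical sanity check on the representation (2E.2), one can sum the covariance series it implies. If the $Z_n$ are orthonormal with mean zero, then formally $E(X(x)\, X(y)) = 2 \sum_n \sin(c_n x) \sin(c_n y) / c_n^2$ with $c_n = (n - \tfrac{1}{2})\pi$, and this should reproduce $\min(x, y)$. The sketch below evaluates a partial sum; the function name and the truncation level are our choices, and a numerical check is of course not a proof.

```python
import math


def covariance_partial_sum(x, y, terms=2000):
    """Partial sum of 2 * sum_n sin(c_n x) sin(c_n y) / c_n**2 with c_n = (n - 1/2) * pi."""
    total = 0.0
    for n in range(1, terms + 1):
        c = (n - 0.5) * math.pi
        total += 2.0 * math.sin(c * x) * math.sin(c * y) / (c * c)
    return total
```

Since the terms are bounded by $2/c_n^2$, the truncation error after $M$ terms is of order $1/M$, so a few thousand terms already give three correct decimals.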

2.9. Consider the setting in Exercise 2.8, i.e., $I = [0, 1]$, and $(\Omega, \mathcal{B}, \nu)$ a fixed
probability space.

(a) Let $\chi_{[0,x)}$ denote the indicator function of the subinterval $[0, x) \subset I$. Show
that

\[ R(x, y) = \min(x, y) = \langle \chi_{[0,x)} \mid \chi_{[0,y)} \rangle_{L^2(I)} \quad \text{for all } x, y \in I. \]

(b) Find the expansion of $\chi_{[0,x)}$ in the ONB $\bigl( \sqrt{2} \sin(n \pi t) \bigr)_{n \in \mathbb{N}}$, and use this to
show that

\[ X(x, \cdot) = \sqrt{2} \sum_{n=1}^{\infty} \frac{1 - \cos(n \pi x)}{n \pi}\, Z_n(\cdot), \]

where $(X(x))$ denotes the usual Brownian motion, and where $(Z_n)_{n \in \mathbb{N}}$ is an
independent family of Gaussian random variables with $E(Z_n) = 0$ and $E(Z_n^2) = 1$; in
particular, the random variables $(Z_n)$ form an orthonormal family, different from the
one in Exercise 2.8.


2.10. Fractional Brownian motion 


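Let $H \in (0, 1)$, and define

\[ R_H(x, y) = \frac{1}{2} \bigl( x^{2H} + y^{2H} - |x - y|^{2H} \bigr) \quad \text{for } x, y \in [0, \infty). \]

(a) Show that $R_H(\cdot, \cdot)$ is positive definite, i.e., that

\[ \sum_n \sum_m \bar{\xi}_n\, R_H(x_n, x_m)\, \xi_m \geq 0 \quad \text{(finite sum)} \]

for all $\xi_1, \xi_2, \dots \in \mathbb{C}$, and all $x_1, x_2, \dots \in [0, \infty)$.
(b) Show that $H = \tfrac{1}{2}$ reduces to the case of the standard Brownian motion
considered in Exercises 2.8-2.9.
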
—and within a week or two I saw that the noncommutation was really the
dominant characteristic of Heisenberg’s new theory. —P.A.M. Dirac 


While the Karhunen-Loève theorem in Exercise 2.7 (pp. 55-56) may be viewed
as a result in probability theory, we will see in Chapters 8 and 9 that it is also a
theorem in operator theory. In fact, it may be viewed as “pure” operator theory. A
similar observation applies to our consideration of Schmidt’s theorem (Exercise 7.11,
p. 150) and to tensor products in general from Chapter 7. One reason for this double
life is the familiar mathematical distinction between Hilbert spaces in their concrete
form (say $L^2$-function spaces, or $\ell^2$-sequence spaces) and in their axiomatic form.
We will see that the use of operator theory in its more axiomatic form helps clarify
the use of tensor products, and this in turn is vital for all the basis constructions we
shall encounter: wavelets, fractals, and time series from signal analysis.

Abstract considerations of Hilbert space are facilitated by Dirac’s elegant bra-ket
notation, which we shall adopt. It is a terminology which makes basis considerations
fit especially nicely into an operator-theoretic framework: If $\mathcal{H}$ is a (complex) Hilbert
space with vectors $x$, $y$, $z$, etc., then we denote the inner product as a Dirac bra-ket,
thus $\langle x \mid y \rangle \in \mathbb{C}$. In contrast, the rank-one operator defined by the two vectors $x$, $y$
will be written as a ket-bra, thus $E = |x\rangle \langle y|$. Hence $E$ is the operator in $\mathcal{H}$ which
sends $z$ into $\langle y \mid z \rangle x$.
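In finite dimensions the ket-bra is easy to spell out. The following sketch represents vectors in $\mathbb{C}^d$ as Python lists and assumes the convention that the inner product is conjugate-linear in its first slot, which is the usual Dirac convention; the function names are ours.

```python
def inner(x, y):
    """The bra-ket <x | y>, conjugate-linear in the first slot (assumed convention)."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


def ket_bra(x, y):
    """The rank-one operator E = |x><y|, returned as the map z -> <y | z> x."""
    def E(z):
        c = inner(y, z)
        return [c * a for a in x]
    return E
```

For example, with `E = ket_bra([1, 1j], [0, 1])`, the vector `E([2, 3])` is the multiple of `[1, 1j]` by the scalar `inner([0, 1], [2, 3])`. When `x == y` has norm one, `E` is the orthogonal projection onto the line spanned by `x`.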

The reader will notice from Exercise 2.6(b) that the conclusion of the spectral 
theorem for compact operators takes an especially nice form when expressed with 
Dirac’s formalism. 


$\mathbb{N}_0$ vs. $\mathbb{Z}$


Every axiomatic (abstract) theory admits, as is well known, an unlimited 
number of concrete interpretations besides those from which it was derived. 
Thus we find applications in fields of science which have no relation to the 
concepts of random event and of probability in the precise meaning of these 
words. —A.N. Kolmogorov 1933 


PREREQUISITES: Periodic functions; power series; Fourier series; examples of 
bases in some function spaces, especially in $L^2$.


Prelude 


The title of this chapter calls for an explanation: Given a base space $X$, an
endomorphism $\sigma$ of $X$, and a prescribed weight function $W$ on $X$, we saw in Section
1.2 that there is an associated measure $P_x$ on the space of infinite paths rooted at
$x$; see Figure 1.1 (p. 8) for an illustration. As noted, the function $W$ determines the
transition probabilities that go into the probability measure $P_x$ as follows: For two
“successive” points $y$ and $z$ on such a path in $X$, a transition is possible if $\sigma(z) = y$,
and the transition probability is then $W(z)$. If points on the path are further apart, we
use a natural formula for conditional probabilities.

In the chapter that follows this prelude we will show that this construction is 
key to our understanding of both geometric and computational aspects of the kind of 
multiresolutions that can be built on X. 

More generally, when path-space measures such as $P_x$ are used in analysis, it
is essential to have detailed knowledge about the “size” of their support. Typically
path-space measures are supported on a rather “small” subset of the full space of all
infinite paths. An example of this is illustrated by the canonical embedding of the
real line $\mathbb{R}$ in the dyadic solenoid. Analogously, Section 1.5 shows that our $P_x$ path
space contains a copy of both $\mathbb{N}_0$ and $\mathbb{Z}$. This takes place via a specific encoding
where $\mathbb{N}_0$ corresponds to the set of paths which after a finite number of bits finish
with an infinite string of zeroes.

In the dyadic case, similarly Z corresponds to the paths which terminate with an 
infinite repetition of the pair 01, i.e., paths starting with an arbitrary finite bit-word 
followed by an infinite string 010101 .... (A systematic study of the encodings into 
paths from general N-adic trees will be resumed in Chapter 8 below.) 

Perhaps surprisingly, the measures $P_x$ tend to have “small” support, and moreover
their support properties are significant for the analysis of wavelets and fractals
associated with systems $(X, W, P_x)$. In this and the next two chapters, we will give
conditions for when the support of $P_x$ is $\mathbb{N}_0$ or $\mathbb{Z}$. And we will show that these cases
of “small support” are “responsible” for a rather nice harmonic analysis.

In Chapters 1 and 2, we saw that the convergence of the infinite product (1.3.5)
depends on the support of the measures $P_x$ (see Lemma 2.4.1). In particular, the
measure $P_x$ applied to the two subsets $\mathbb{N}_0$ and $\mathbb{Z}$ in $\Omega$ is crucial. Recall from Remark
1.4.2 that the integers $\mathbb{Z}$ are naturally embedded in the probability space $\Omega$. Hence,
in this chapter, we will study the two functions $P_x(\mathbb{N}_0)$ and $P_x(\mathbb{Z})$.


3.1 Terminology


Let $N \in \mathbb{N}$, $N \ge 2$, be given. The cyclic group $\mathbb{Z}_N = \mathbb{Z}/N\mathbb{Z}$ will be identified with
$\{0, 1, \dots, N-1\}$, and the operations of multiplication and addition are modulo $N$;
or we use the fact that $\mathbb{Z}_N$ is a ring, and a cyclic group under addition.
Similarly, we will work with the circle, or one-torus, $\mathbb{T}$, as $\mathbb{R}/\mathbb{Z}$, and $\mathbb{R}$ will
serve as a covering
$$\mathbb{R} \to \mathbb{R}/\mathbb{Z} \simeq \mathbb{T}. \tag{3.1.1}$$


In this case, we can take advantage of Fourier duality and use the realization
$$\mathbb{R} \ni x \mapsto e^{i2\pi x} \in \mathbb{T}. \tag{3.1.2}$$


Functions on $\mathbb{T}$ may be viewed, or realized, in either one of the following
equivalent ways (i)–(ii).


(i) Functions $f$ on $\mathbb{R}$ which are 1-periodic, i.e.,
$$f(x+1) = f(x), \qquad x \in \mathbb{R}. \tag{3.1.3}$$


If $f$ is measurable, then (3.1.3) is understood to hold only a.e.
(ii) If $f$ is absolutely integrable on $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ with respect to Haar measure, then it
has a Fourier expansion
$$f(x) = \sum_{k\in\mathbb{Z}} c_k e^{i2\pi kx}, \tag{3.1.4}$$
or
$$f(z) = \sum_{k\in\mathbb{Z}} c_k z^k, \tag{3.1.5}$$
where
$$c_k = \int_0^1 f(x)\, e^{-i2\pi kx}\,dx = \int_{\mathbb{T}} f(z)\, z^{-k}\,d\mu(z); \tag{3.1.6}$$
the second formula uses (3.1.2), where $\mathbb{T}$ is taken to be
$$\mathbb{T} = \{ z \in \mathbb{C} \mid |z| = 1 \}, \tag{3.1.7}$$
and where (3.1.7) may serve to define the Haar measure $\mu$ on $\mathbb{T}$, i.e.,
$$d\mu(e^{i2\pi x}) \simeq dx \text{ on } [0,1). \tag{3.1.8}$$


If $W$ is a bounded measurable function on $\mathbb{T}$, then the corresponding Ruelle
operator $R = R_W$ may be written in either one of the following equivalent forms:


(i$_R$)
$$(R_W f)(x) = \sum_{k=0}^{N-1} W\left(\frac{x+k}{N}\right) f\left(\frac{x+k}{N}\right) \tag{3.1.9}$$
or
(ii$_R$)
$$(R f)(z) = \sum_{w\in\mathbb{T},\ w^N = z} W(w)\, f(w), \qquad z \in \mathbb{T}, \tag{3.1.10}$$
where $f$ is a bounded measurable function on $\mathbb{T}$. The function which is constant
1 on $\mathbb{T}$ will be denoted $\mathbb{1}$. Our normalization condition (1.2.3) may therefore be
restated as
$$R_W \mathbb{1} = \mathbb{1}. \tag{3.1.11}$$


The equivalence of the two formulations (i) and (ii), or of (i$_R$) and (ii$_R$), also
yields the following result.


Lemma 3.1.1. Let $W$ and $f$ be bounded 1-periodic functions on $\mathbb{R}$. (If the two
functions are measurable, they will be assumed only essentially bounded, and a.e.
1-periodic.) Then the function
$$\sum_{k=0}^{N-1} W\left(\frac{x+k}{N}\right) f\left(\frac{x+k}{N}\right) \tag{3.1.12}$$
is again bounded and 1-periodic.

Proof. Note that the expression in (3.1.12) is the right-hand side in (3.1.9), or
equivalently in (3.1.10), and so it is the function $R_W f$. We now give the explicit argument
as to why it is 1-periodic. Note that the individual terms in the sum (3.1.12) have
period $N$ and not 1. Let
$$g(x) = \sum_{k=0}^{N-1} W\left(\frac{x+k}{N}\right) f\left(\frac{x+k}{N}\right).$$
Then
$$g(x+1) = \sum_{k=1}^{N-1} W\left(\frac{x+k}{N}\right) f\left(\frac{x+k}{N}\right) + W\left(\frac{x+N}{N}\right) f\left(\frac{x+N}{N}\right)
= \sum_{k=0}^{N-1} W\left(\frac{x+k}{N}\right) f\left(\frac{x+k}{N}\right) = g(x),$$
where the last step uses $(x+N)/N = x/N + 1$ and the 1-periodicity of $W$ and $f$. □
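The periodicity argument in the proof of Lemma 3.1.1 can be checked numerically. The following is a minimal sketch, not part of the original text: the particular functions `W` and `f` and the value `N = 3` are illustrative assumptions, chosen only because they are bounded and 1-periodic.

```python
import math

def ruelle(W, f, N):
    """Return R_W f, where (R_W f)(x) = sum_{k=0}^{N-1} W((x+k)/N) f((x+k)/N)."""
    def Rf(x):
        return sum(W((x + k) / N) * f((x + k) / N) for k in range(N))
    return Rf

# Illustrative 1-periodic choices (assumptions, not taken from the text).
W = lambda x: math.cos(math.pi * x) ** 2          # period 1
f = lambda x: 2.0 + math.sin(2 * math.pi * x)     # period 1

N = 3
g = ruelle(W, f, N)

# The full sum is 1-periodic ...
max_gap = max(abs(g(x + 1) - g(x)) for x in [0.0, 0.17, 0.5, 0.93])

# ... although a single term of the sum (here k = 0) is not.
x0 = 0.17
term_gap = abs(W((x0 + 1) / N) * f((x0 + 1) / N) - W(x0 / N) * f(x0 / N))
```

Here `max_gap` should be at the level of rounding error, while `term_gap` is visibly non-zero, which matches the remark that the individual terms have period $N$ and not 1.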


3.2 The unit interval 


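As before, $N \ge 2$, $N \in \mathbb{N}$, is given and fixed throughout. We now revert to the setting
of Chapter 2, but specializing to $X = [0,1]$ = the unit interval. Then the mappings
$\sigma, \tau_0, \dots, \tau_{N-1}$ may be specified as follows:
$$\begin{aligned}
\sigma &\colon x \mapsto Nx \bmod 1,\\
\tau_k &\colon x \mapsto \frac{x+k}{N}, \qquad k = 0, 1, \dots, N-1;
\end{aligned} \tag{3.2.1}$$
or alternatively, in complex form,
$$\begin{aligned}
\sigma &\colon z \mapsto z^N, \qquad z \in \mathbb{T},\\
\tau_k &\colon z \mapsto \text{the $k$'th of the $N$ roots}, \qquad k = 0, \dots, N-1.
\end{aligned} \tag{3.2.2}$$
The second line in (3.2.2) may be spelled out as follows:
$$\tau_k \colon e^{i2\pi x} \mapsto e^{i2\pi(x+k)/N}. \tag{3.2.3}$$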


Lemma 3.2.1. Let $W\colon [0,1] \to [0,1]$ be a given measurable function and extend
$W$ from $[0,1]$ to $\mathbb{R}$ by periodicity. Then the densities for the probabilities $P_x$ in
(1.2.4) are
$$W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x) = W\left(\frac{x+k}{N}\right) \cdots W\left(\frac{x+k}{N^n}\right), \tag{3.2.4}$$
where
$$k = \omega_1 + \omega_2 N + \cdots + \omega_n N^{n-1}, \tag{3.2.5}$$
and $\omega_i \in \{0, 1, \dots, N-1\}$, $1 \le i \le n$.
Proof. A direct calculation, using (3.2.1) and the periodicity of $W$, yields
$$W(\tau_{\omega_s} \cdots \tau_{\omega_1} x) = W\left(\frac{x+k}{N^s}\right) \qquad \text{for } 1 \le s \le n.$$
It is understood that the integer $k \in \mathbb{N}_0$ is given by (3.2.5). Conversely, every $k \in \mathbb{N}_0$
has a unique such representation if we agree to let $\omega_n$ be the last (if any) non-zero
term. The result (3.2.4) follows. □
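The identity (3.2.4) lends itself to a brute-force check over all digit strings of a fixed length. The sketch below is illustrative only: the weight `W`, the base point `x`, and the values of `N` and `n` are assumptions, and the helper names are not from the text.

```python
import math
from itertools import product

def tau(k, x, N):
    """Branch tau_k(x) = (x + k) / N of the inverse of x -> N x mod 1."""
    return (x + k) / N

def path_weight(W, omegas, x, N):
    """Left side of (3.2.4): W(tau_{w1} x) * W(tau_{w2} tau_{w1} x) * ..."""
    weight, y = 1.0, x
    for w in omegas:
        y = tau(w, y, N)
        weight *= W(y)
    return weight

def integer_weight(W, k, n, x, N):
    """Right side of (3.2.4): W((x+k)/N) * ... * W((x+k)/N^n)."""
    return math.prod(W((x + k) / N ** s) for s in range(1, n + 1))

W = lambda y: math.cos(math.pi * y) ** 2   # illustrative 1-periodic weight
N, n, x = 3, 4, 0.3

worst = 0.0
for omegas in product(range(N), repeat=n):
    k = sum(w * N ** i for i, w in enumerate(omegas))   # the encoding (3.2.5)
    gap = abs(path_weight(W, omegas, x, N) - integer_weight(W, k, n, x, N))
    worst = max(worst, gap)
```

The two sides agree only because `W` is 1-periodic: the arguments on the two sides differ by integers, not by zero.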


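If $n$ is given, and $0 \le k \le N^n - 1$, then the terms $\omega_1, \dots, \omega_n$ in (3.2.5) are
determined. Let $k \in \mathbb{Z}$. Then determine $k', l \in \mathbb{Z}$ such that
$$0 \le k' \le N^n - 1 \quad\text{and}\quad k = k' + lN^n. \tag{3.2.6}$$
This may be done with the Euclidean algorithm (division with remainder). If
$$k' = \omega_1 + \omega_2 N + \cdots + \omega_n N^{n-1},$$
it follows by Lemmas 1.4.1 and 3.1.1 that
$$W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x) = W\left(\frac{x+k'}{N}\right) \cdots W\left(\frac{x+k'}{N^n}\right)
= W\left(\frac{x+k}{N}\right) \cdots W\left(\frac{x+k}{N^n}\right).$$
As a result, the probability densities $P_x^{(n)}(\{k\})$ are now defined for all $k \in \mathbb{Z}$, and
not just for $0 \le k \le N^n - 1$.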
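The reduction (3.2.6) and the digit expansion (3.2.5) can be sketched in a few lines. This is a hedged illustration: the function names `reduce_mod` and `digits` are hypothetical, and the sample values are chosen only to show what happens for a negative $k$.

```python
def reduce_mod(k, N, n):
    """Return (k_prime, l) with 0 <= k_prime <= N**n - 1 and k = k_prime + l * N**n."""
    l, k_prime = divmod(k, N ** n)   # Python's divmod floors, so k_prime >= 0
    return k_prime, l

def digits(k_prime, N, n):
    """Digits (omega_1, ..., omega_n) of k_prime in base N, least significant first."""
    out = []
    for _ in range(n):
        k_prime, w = divmod(k_prime, N)
        out.append(w)
    return out

k_prime, l = reduce_mod(-5, 3, 2)   # -5 = 4 + (-1) * 9
omega = digits(k_prime, 3, 2)       # 4 = 1 + 1 * 3
```

For $k = -5$, $N = 3$, $n = 2$ this gives $k' = 4$, $l = -1$, and digits $(\omega_1, \omega_2) = (1, 1)$.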


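Proposition 3.2.2. Let $N$, $X = [0,1]$, $W$, $\sigma$, $\tau_0, \dots, \tau_{N-1}$, $\Omega = \{0, \dots, N-1\}^{\mathbb{N}}$,
and $P_x^{(n)}(\{k\})$ be as described above. In particular, assume that
$$\sum_{k=0}^{N-1} W\left(\frac{x+k}{N}\right) = 1 \qquad \text{a.e. } x \in [0,1]. \tag{3.2.7}$$
Then it follows that
$$\sum_{k=-N^n}^{(N-1)N^n-1} P_x^{(n+1)}(\{k\}) = 1, \qquad x \in \mathbb{R}. \tag{3.2.8}$$

Proof. We claim that
$$\sum_{k=-N^n}^{(N-1)N^n-1} P_x^{(n+1)}(\{k\}) = \sum_{k=-N^n}^{(N-1)N^n-1} P_x^{(n+1)}(\{k+N^n\}). \tag{3.2.9}$$

Once this is proved, the result follows since the sum on the right-hand side in (3.2.9)
is
$$\begin{aligned}
\sum_{k=0}^{N^{n+1}-1} P_x^{(n+1)}(\{k\}) &= \sum_{(\omega_1, \omega_2, \dots, \omega_{n+1})} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_{n+1}} \cdots \tau_{\omega_1} x)\\
&= \sum_{y \in X,\ \sigma^{n+1}(y) = x} W(\sigma^n y)\, W(\sigma^{n-1} y) \cdots W(y)\\
&= R_W^{n+1}(\mathbb{1})(x) = \mathbb{1}(x) = 1,
\end{aligned}$$
where we used Lemmas 3.1.1 and 3.2.1, and formula (2.4.7).
We now prove (3.2.9) by induction. The case $n = 0$ reduces to the assumption
(3.2.7). Suppose (3.2.9) holds up to $n - 1$. Then the next term is
$$\begin{aligned}
\sum_{k=-N^n}^{(N-1)N^n-1} &W\left(\frac{x+k+N^n}{N}\right) W\left(\frac{x+k+N^n}{N^2}\right) \cdots W\left(\frac{x+k+N^n}{N^{n+1}}\right)\\
&= R_W\Biggl(\,\underbrace{\sum_{k=-N^{n-1}}^{(N-1)N^{n-1}-1} W\left(\frac{\,\cdot\,+k}{N}\right) \cdots W\left(\frac{\,\cdot\,+k}{N^n}\right)}_{=\,1 \text{ by the induction hypothesis}}\Biggr)(x)\\
&= R_W(\mathbb{1})(x) = 1,
\end{aligned}$$
and the induction step is completed. □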
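As a sanity check on (3.2.8), one can sum the densities numerically for a weight that satisfies (3.2.7). The sketch below assumes $N = 2$ and the weight $W(y) = \cos^2(\pi y)$, for which $W(x/2) + W((x+1)/2) = \cos^2(\pi x/2) + \sin^2(\pi x/2) = 1$; this choice is an illustration, not something fixed by the text.

```python
import math

def density(W, k, n, x, N):
    """P_x^{(n)}({k}) = W((x+k)/N) * ... * W((x+k)/N^n)."""
    return math.prod(W((x + k) / N ** s) for s in range(1, n + 1))

def total_mass(W, n, x, N):
    """Left side of (3.2.8): sum over -N^n <= k <= (N-1) N^n - 1 of P_x^{(n+1)}({k})."""
    return sum(density(W, k, n + 1, x, N)
               for k in range(-N ** n, (N - 1) * N ** n))

# Illustrative weight for N = 2; it is 1-periodic and satisfies (3.2.7).
W = lambda y: math.cos(math.pi * y) ** 2
totals = [total_mass(W, n, 0.3, 2) for n in range(6)]
```

Each entry of `totals` should equal 1 up to rounding, for every level $n$ tried.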


3.3 A sufficient condition for $P_x(\mathbb{Z}) = 1$


In applications to wavelets, we noted in Chapter 1 that the scaling function $\varphi$ of an
$N$-scale wavelet satisfies
$$|\hat\varphi(x+k)|^2 = P_x(\{k\}). \tag{3.3.1}$$

It is well known that orthonormality of the corresponding $N$-adic wavelet system
is equivalent to the normalization
$$\sum_{k\in\mathbb{Z}} |\hat\varphi(x+k)|^2 = 1 \qquad \text{a.e. } x \in [0,1]. \tag{3.3.2}$$

As a result, it is of interest to decide when the normalization property
$$P_x(\mathbb{Z}) = \sum_{k\in\mathbb{Z}} P_x(\{k\}) = 1$$
holds, or does not.
Recall [BrJo02b] that a scale-$N$ orthonormal wavelet is a system
$$\psi_1, \dots, \psi_{N-1} \in L^2(\mathbb{R})$$
such that
$$\left\{ N^{j/2}\, \psi_i(N^j x - k) \;\middle|\; j, k \in \mathbb{Z},\ i = 1, \dots, N-1 \right\}$$
is an orthonormal basis for $L^2(\mathbb{R})$.


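Theorem 3.3.1. Let $N$, $X = [0,1]$, $W$, $\sigma$, $\tau_0, \dots, \tau_{N-1}$, and $\Omega$ be as in Proposition
3.2.2. In particular, we assume that $W\colon [0,1] \to [0,1]$ is measurable and satisfies
(3.2.7). For $x \in [0,1]$, let $P_x$ be the corresponding measure on $\Omega$ determined in
Lemma 2.4.1.

Suppose in addition that there are $n_0 \in \mathbb{N}$, $b \in \mathbb{R}_+$ such that
$$W\left(\frac{x+k}{N^{n_0}}\right) \cdots W\left(\frac{x+k}{N^{n_0+p}}\right) \ge b \quad \text{for all } p \in \mathbb{N} \text{ and all integers } k \text{ with } -N^{n_0+p} \le k < (N-1)N^{n_0+p}. \tag{3.3.3}$$
Then it follows that
$$P_x(\mathbb{Z}) = 1;$$
i.e., that
$$\sum_{k\in\mathbb{Z}} P_x(\{k\}) = 1.$$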

Proof. Let $x \in [0,1]$ be given. Suppose the numbers $n_0$ and $b$ have been chosen
such that (3.3.3) holds. Then set
$$f_n(k) = \chi_{[-N^n,\,(N-1)N^n)}(k)\; W\left(\frac{x+k}{N}\right) \cdots W\left(\frac{x+k}{N^{n+1}}\right).$$

Let $P_x$ be the measures on $\Omega$ from Lemma 2.4.1. Then it follows from Proposition
3.2.2 that $\sum_{k\in\mathbb{Z}} f_n(k) = 1$. From Lemmas 2.4.1 and 3.2.1, we have
$$\lim_{n\to\infty} f_n(k) = P_x(\{k\})$$
and
$$\sum_{k\in\mathbb{Z}} P_x(\{k\}) = P_x(\mathbb{Z}) \le 1.$$
But, using (3.3.3), we get
$$f_n(k) \le b^{-1} P_x(\{k\}), \qquad n \ge n_0. \tag{3.3.4}$$
(The constant is $b^{-1}$ because $b$ is a lower bound on a product of values of $W$, all of
which lie in $[0,1]$.)

The conclusion $P_x(\mathbb{Z}) = 1$ now follows from the dominated convergence theorem,
and a second application of Proposition 3.2.2. □

Remarks 3.3.2. (a) If $W$ is continuous near $x = 0$, and if $W(0) = 1$, then condition
(3.3.3) will be satisfied provided $W$ does not vanish on the $N$-adic rationals $kN^{-n}$,
$n \ge n_0$.

(b) Even though $W$ is initially only defined on $[0,1]$, we extend it to $\mathbb{R}$ as a
1-periodic function. We already noted in Lemma 3.1.1 that $R_W$ is acting on the
1-periodic functions on $\mathbb{R}$.
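A concrete case may help make Theorem 3.3.1 tangible. The sketch below assumes $N = 2$ and the Haar-type weight $W(y) = \cos^2(\pi y)$, an illustrative choice not fixed by the text. For this weight the classical product formula $\prod_{s\ge1} \cos(t/2^s) = \sin(t)/t$ gives $P_x(\{k\}) = \bigl(\sin \pi(x+k) / \pi(x+k)\bigr)^2$, so the point masses can be compared with a closed form and then summed over a window of integers. The truncation levels `factors` and `K` are assumptions.

```python
import math

def W(y):
    return math.cos(math.pi * y) ** 2   # illustrative weight, N = 2

def point_mass(x, k, factors=60):
    """Truncation of P_x({k}) = prod_{s>=1} W((x+k)/2^s)."""
    return math.prod(W((x + k) / 2 ** s) for s in range(1, factors + 1))

def sinc_sq(t):
    """(sin(pi t) / (pi t))^2, with the value 1 at t = 0."""
    return 1.0 if t == 0 else (math.sin(math.pi * t) / (math.pi * t)) ** 2

x, K = 0.3, 2000
# Agreement of the truncated product with the closed form on a small window.
gap = max(abs(point_mass(x, k) - sinc_sq(x + k)) for k in range(-20, 21))
# Partial sum over |k| <= K; it approaches P_x(Z) = 1 from below as K grows.
partial = sum(point_mass(x, k) for k in range(-K, K + 1))
```

The partial sum falls short of 1 only by a tail of order $1/K$, which is what the conclusion $P_x(\mathbb{Z}) = 1$ predicts for this weight.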


Exercises 


3.1. Let $N$ be a (fixed) natural number, $N > 1$. For $k \in \{0, 1, 2, \dots, N-1\}$, let $\tau_k$
denote the transformation $x \mapsto Nx + k$, viewed as endomorphisms in $\mathbb{Z}$.

Set $C := \{0, -1\}$, and let $D$ be the smallest subset of $\mathbb{Z}$ which contains $C$ and is
invariant under all the maps $\tau_k$.

(a) Show that $D = \mathbb{Z}$.

(b) What other two-element subsets $C$ of $\mathbb{Z}$ have the property that the smallest
subset of $\mathbb{Z}$ which contains $C$ and is invariant under all the maps $\tau_k$ is all of $\mathbb{Z}$?

(c) (Note that the transformations $\tau_k\colon x \mapsto Nx + k$, for $k \in \{0, 1, 2, \dots, N-1\}$,
leave $\mathbb{N}_0$ invariant.) Show that the smallest subset of $\mathbb{N}_0$ which contains $\{0\}$ and is
invariant under all the maps $\tau_k$ is all of $\mathbb{N}_0$.


3.2. (a) Consider functions $f$ and $g$ as in (3.1.5):
$$f(z) = \sum_{k\in\mathbb{Z}} c_k z^k, \qquad g(z) = \sum_{k\in\mathbb{Z}} d_k z^k.$$
Show that
$$f(z)\, g(z) = \sum_{k} l_k z^k, \tag{3E.1}$$
where
$$l_k = \sum_{p\in\mathbb{Z}} c_p\, d_{k-p}. \tag{3E.2}$$

(b) Show that if $(c_k)$ and $(d_k) \in \ell^1$, then $(l_k)$ in (3E.2) is also in $\ell^1$.
(c) Is the conclusion in (b) valid if $\ell^1$ is replaced with $\ell^2$?


3.3. Let $(c_k) \in \ell^2$, and let a function $f$ be defined by (3.1.5). Then show that
$f \in L^2(\mathbb{T})$, where $\mathbb{T}$ is given the Haar measure; and moreover that the Parseval identity
$$\|c\|_{\ell^2} = \|f\|_{L^2(\mathbb{T})}$$
holds.

3.4. Let $f \in L^2(\mathbb{T})$ and consider the Fourier correspondence (3.1.5). Set
$$(Ff)(z) = \frac{1}{N} \sum_{w\in\mathbb{T},\ w^N = z} f(w).$$
Show that $\lim_{n\to\infty} F^n f$ exists in $L^2(\mathbb{T})$.


3.5. Let $f \in L^2(\mathbb{T})$ be as in Exercise 3.4, and consider (3.1.5).
(a) Show that then
$$(Ff)(z) = \sum_{k\in\mathbb{Z}} c_{Nk}\, z^k.$$
(b) Let $S = F^*$ be determined by
$$\langle Sf \mid g \rangle_{L^2} = \langle f \mid Fg \rangle_{L^2} \qquad \text{for all } f, g \in L^2.$$
Show that then
$$(Sf)(z) = f(z^N), \qquad z \in \mathbb{T},$$
and further that $S$ is isometric in $L^2(\mathbb{T})$.
(c) Show that
$$(Sf)(z) = \sum_{k\in\mathbb{Z}} c_k\, z^{Nk}, \qquad z \in \mathbb{T},\ f \in L^2(\mathbb{T}).$$
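Before attempting Exercises 3.1 and 3.5(a), it can be useful to experiment. The sketch below is not a solution; it only computes a finite portion of the orbit in Exercise 3.1 and tests the coefficient formula of Exercise 3.5(a) on one trigonometric polynomial. The helper names, the coefficients `coeffs`, and the sample point `z` are hypothetical choices.

```python
import cmath

def orbit(seed, N, steps):
    """Apply all maps tau_k(x) = N*x + k, k = 0..N-1, to `seed` for `steps` rounds."""
    reached = set(seed)
    frontier = set(seed)
    for _ in range(steps):
        frontier = {N * x + k for x in frontier for k in range(N)} - reached
        reached |= frontier
    return reached

def average_over_roots(f, N):
    """(F f)(z) = (1/N) * sum of f(w) over the N solutions w of w^N = z, |z| = 1."""
    def Ff(z):
        theta = cmath.phase(z)
        roots = [cmath.exp(1j * (theta + 2 * cmath.pi * j) / N) for j in range(N)]
        return sum(f(w) for w in roots) / N
    return Ff

# Exercise 3.1: after n rounds the seed {0, -1} has reached the window [-N^n, N^n - 1].
N_orbit, steps = 3, 4
D = orbit({0, -1}, N_orbit, steps)
D0 = orbit({0}, N_orbit, steps)

# Exercise 3.5(a): compare F f with the series that keeps only the coefficients c_{Nk}.
coeffs = {-4: 0.5, -3: 0.7, -1: 2.0, 0: 1.0, 2: -3.0, 3: 0.25, 6: 1.5}  # hypothetical c_k
f = lambda w: sum(c * w ** k for k, c in coeffs.items())
N = 3
z = cmath.exp(2j * cmath.pi * 0.137)
lhs = average_over_roots(f, N)(z)
rhs = sum(c * z ** (k // N) for k, c in coeffs.items() if k % N == 0)
```

The finite windows suggest, but of course do not prove, the statements $D = \mathbb{Z}$ and $D_0 = \mathbb{N}_0$.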


References and remarks 


Some relevant references covering infinite products, as used in this chapter, are the 
books [Rud87, Zyg32], and papers which aim close to the present theme are [JoPe98, 
BeBK05, BeBe95, CvSS85]. 

The idea of using harmonic analysis of iterated function systems (IFS) and more 
general fractals in the study of similarity structures may originate with the pa- 
per [JoPe96] by Steen Pedersen and the author; see also [JoPe98], [BrJo99a], and 
[Str98]. 

The next chapter turns to harmonic analysis of affine IFSs. We will see that the 
kind of Fourier bases that can be devised for affine IFSs and wavelet bases have one 
thing in common: they are both better localized than Fourier waves. As a result (see 
[JoPe98], [Str05], and [Tao96]) both classes of the new bases have better conver- 
gence properties than traditional Fourier series; and the improvements come almost 
entirely from the fact that these new bases are better localized: if you try to adjust
locally in a basis expansion with a Fourier wave, you get errors at far distances. Not
so for localized bases!


A case study: Duality for Cantor sets


Closer to this book and equally illuminating are the many problems trig- 
gered by a sound or a picture. Only afterwards is a formula devised, and 
then proclaimed.... —Benoit B. Mandelbrot


PREREQUISITES: The Euclidean algorithm, recursion; the positional number rep- 
resentation; rudimentary examples of abelian groups; spectrum; Lebesgue; Fatou; 
Cantor. 


Prelude 


A well-known principle in Fourier series (reviewed in Section 3.1) for functions
on a finite interval states that an orthogonal trigonometric basis exists and will be
indexed by an arithmetic progression of (Fourier) frequencies, i.e., by integers times
the inverse wave length. Similarly, in higher dimensions $d$, we define periodicity
in terms of a lattice of rank $d$. The principle states that for $d$-periodic functions on
$\mathbb{R}^d$, the appropriate Fourier frequencies may then be realized by a certain dual
rank-$d$ lattice. In this case, the inverse relation is formulated as a duality principle for
lattices; see, for example, [JoPe93] for a survey of this point.

The purpose of this chapter is to extend, in a self-contained presentation, this 
duality principle to a class of fractals. 

In order to capture the essence of the idea, we have restricted the exposition 
here to those compact fractals that are realized by simple arithmetic on the real 
line, and which were considered first by Cantor. Hence, the middle-third Cantor 
set is an example. But in addition to scaling by 3, we shall consider more general 
notions of scaling, including (later in the book) matrix scaling for the multivariable 
case. Specifically, here we shall aim for compact Cantor constructions which take
place in $\mathbb{R}^d$.

Our first general observation is that a fixed such affine fractal $X$ has an associated
and canonical probability measure $\mu$ ($= \mu_X$). We define (Section 4.1) our measure
$\mu$ by a precise invariance property which is induced directly by the affine structure
of $X$. The issue of the measure is a delicate one, as $X$, being a fractal, is a non-linear
object and does not carry the structure of a group, or even anything close to that. (So
no Haar measure!)

Nonetheless the measure $\mu$ allows us to formulate a natural Fourier principle,
and to ask for “fractal” Fourier bases. Having $\mu$, we may then imitate Fourier’s
construction, i.e., we may ask for an orthogonal Fourier basis for $L^2(X, \mu)$. But since $X$
is constructed from recursively leaving out “fractions” (in $\mathbb{R}^d$), we will then expect
that there is a dual “fractal-like” thinning of frequencies at infinity, i.e., a thinning
relative to some rank-$d$ lattice. Intuitively, we will expect a set $\Lambda$ of Fourier
frequencies to come from “fractals in the large.” So we ask for the complex exponentials
$e^{i\lambda\cdot x}$ indexed by $\lambda \in \Lambda$ to form an orthogonal basis for $L^2(X, \mu)$. We shall refer to
such an orthogonal basis as a Fourier basis, or a complete orthogonal set of Fourier
frequencies. (Our choice of complex exponentials, as opposed to Fourier’s traditional
wave-frames of sinusoids, is mainly convenience.)

In the context of affine fractals $X$, the surprise is that such orthogonal Fourier
bases exist at all for some fractals, and not for others. In fact, it was previously 
believed that only asymptotic basis formulas were possible in the “fractal world.” 

If d = 1, we show that the basis principle works when Cantor’s middle-interval 
construction is done with scale 4, but not with scale 3. So, surprisingly, the famil- 
iar middle-third Cantor set (the first example that comes to mind!) does not have a 
Fourier basis, but if it is modified a little, changing the scale number from 3 to 4, 
then we show that Fourier’s basis principle holds. 


4.1 Affine iterated function systems: The general case 


We considered in Chapter 2 a general class of iterated function systems (IFS) $X$
with $N$ branches, $N \ge 2$. They are defined in a general measure-theoretic category.
We specified probabilities for random walks on the corresponding branches. They
are determined by a fixed measurable function $W$ on $X$ and a normalization (see
(1.2.3)). What results (Lemma 2.4.1) is a family of path-space measures $P_x$ on the
probability space $\Omega = \{0, 1, \dots, N-1\}^{\mathbb{N}}$, for $x \in X$.

In Chapter 3 we saw that both the natural numbers $\mathbb{N}_0$ and the integers $\mathbb{Z}$ embed
naturally as subsets of $\Omega$. As we saw in Chapter 1, the traditional integrability
properties of wavelets in the Hilbert space $L^2(\mathbb{R})$ turn out to be closely related to
the function $x \mapsto P_x(\mathbb{Z})$. For example, we proved that the orthonormal basis (ONB)
property for the standard wavelets is equivalent to the identity
$$P_x(\mathbb{Z}) = 1 \qquad \text{a.e. } x \in X.$$

In other words, we are concerned with deciding when $\mathbb{Z}$ has full measure in $\Omega$.

In this chapter, we consider instead a family of Cantor sets $X$, totally disconnected
compact subsets of $\mathbb{R}$ of Hausdorff dimension $s$, $0 < s < 1$. As we show
below, there is a natural basis question for these Cantor sets which is similarly
related to the function $x \mapsto P_x(\mathbb{N}_0)$, i.e., to the size of $\mathbb{N}_0$ in $\Omega$.

Each Cantor set is an affine iterated function system (IFS), and it will have a
natural realization $X$ in $[0, 1]$, and a conjugate one $\overline{X}$ in $[-1, 0]$. We will refer to
them as the right-handed, resp., the left-handed version of $X$.

The question we raise is when $X$ has an ONB consisting of Fourier frequencies
(see the definitions below). Let $P_x$, $x \in \overline{X}$, be the path-space measures. We then show
(see also Theorem 5.4.1) that the ONB property for $X$ is equivalent to the identity
$$P_x(\mathbb{N}_0) = 1, \qquad x \in \overline{X}.$$

In other words, in this case, the requirement is that $\mathbb{N}_0$ have full measure as a subset
of $\Omega$.
The function
$$W(x) := \cos^2(2\pi x) \tag{4.1.1}$$
came up in an earlier study by Jorgensen and Pedersen [JoPe98]. In that paper, we ask
which Cantor sets have orthonormal bases (ONB) $\{e_\lambda \mid \lambda \in \Lambda\}$ for some $\Lambda \subset \mathbb{R}$.
Here
$$e_\lambda(x) = e^{i2\pi\lambda x}, \qquad x \in \mathbb{R}. \tag{4.1.2}$$


While wavelets have fractal features, it turns out that a class of affine fractals,
such as Cantor sets with division scale 3 or 4, have wavelet-like features. We illustrate
this with two examples, and we then cover the general theory later in the chapter; see
Sections 4.3–4.4.

One of the results in [JoPe98] states that a certain class of Cantor sets $X$ do admit
orthonormal bases of this form for suitable choices of sets $\Lambda$. If some $X$ admits an
ONB $\{e_\lambda \mid \lambda \in \Lambda\}$, we say that $(X, \Lambda)$ is a spectral pair and that $(e_\lambda)_{\lambda\in\Lambda}$ is a Fourier
basis.

The following two examples (Figure 4.1) illustrate this point: in Figure 4.1(a),
we sketch the middle-third Cantor set $X_3$. It is constructed from
$$\sigma\colon x \mapsto 3x \bmod \mathbb{Z}$$
and the two branches of $\sigma^{-1}$,
$$\tau_0(x) = \frac{x}{3} \quad\text{and}\quad \tau_1(x) = \frac{x+2}{3}.$$
The Cantor set $X_4$ in Figure 4.1(b) is constructed from
$$\sigma\colon x \mapsto 4x \bmod \mathbb{Z}$$
and the two branches of $\sigma^{-1}$,
$$\tau_0(x) = \frac{x}{4} \quad\text{and}\quad \tau_1(x) = \frac{x+2}{4}. \tag{4.1.3}$$

We prove in [JoPe98] that $X_3$ does not have a Fourier basis, while $X_4$ does.
Even so, Dutkay and Jorgensen showed in [DuJo06b] that all the affine IFS fractals
$X$ admit orthonormal wavelet bases. These wavelet bases are constructed from the
ambient Hausdorff measure “$(dx)^s$” of dimension $s$ (= the Hausdorff dimension
of $X$), and they are realized in a separable Hilbert space built from $(dx)^s$. We call
them “gap-filling wavelets.” To help appreciate the examples, we recall the Hausdorff
measure $(dx)^s$, $0 < s < 1$, and its restriction to the corresponding Cantor sets; see
also [Fal85] and [Hut81].

Let $N \in \mathbb{N}$, $N \ge 2$, be given. Pick a subset $B \subset \mathbb{R}$ such that the points in $B$
represent distinct residue classes in $\mathbb{Z}/N\mathbb{Z}$, i.e., such that $N$ does not divide $b - b'$
when $b$ and $b'$ are distinct points in $B$. Then there is a unique probability measure
$\mu = \mu_{(N,B)}$ on $\mathbb{R}$ such that
$$\mu = \frac{1}{\#(B)} \sum_{b\in B} \mu \circ \tau_b^{-1}, \tag{4.1.4}$$
where $\tau_b(x) := (x + b)/N$. Let $p := \#(B)$. Then the support of $\mu_{(N,B)}$ is a Cantor
set $X_{(N,B)}$ of Hausdorff dimension $s = \ln p / \ln N = \log_N(p)$. Hence the Hausdorff
dimension of $X_3$ is $\log_3(2)$, and for $X_4$ in Figure 4.1(b), it is $\log_4(2) = 1/2$.
For these fractals, the Hausdorff dimension equals the scaling dimension. We will
not go into details here, but refer instead to [Fal85, Fal90]. The scaling dimension $s$
is typically easier to compute; the formula is
$$s = \frac{\log(\text{number of replicas})}{\log(\text{magnification factor})}. \tag{4.1.5}$$
To be precise, we say that $X_{(N,B)}$ has a Fourier basis if, for some $\Lambda$, the family
$\{e_\lambda \mid \lambda \in \Lambda\}$ is an ONB for $L^2(\mu_{(N,B)})$.
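Formula (4.1.5) is easy to evaluate for the two running examples. The following is a minimal sketch; the function name `scaling_dimension` is a hypothetical helper, not notation from the text.

```python
import math

def scaling_dimension(replicas, magnification):
    """Formula (4.1.5): s = log(number of replicas) / log(magnification factor)."""
    return math.log(replicas) / math.log(magnification)

s3 = scaling_dimension(2, 3)   # middle-third Cantor set X_3: log_3(2), about 0.63
s4 = scaling_dimension(2, 4)   # quarter Cantor set X_4: log_4(2) = 1/2
```

Both sets consist of two replicas of themselves; only the magnification factor differs, which is why $X_4$ has the smaller dimension.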


4.2 The quarter Cantor set: The example $W(x) = \cos^2(2\pi x)$


Returning to X4 in Figure 4.1(b), we recall the following lemma from [JoPe98]. 


(a) $X_3$: middle-third Cantor set (axis ticks 0, 1/3, 2/3, 1).

(b) $X_4$: the quarter Cantor set (axis ticks 0, 1/4, 1/2, 3/4, 1).

(c) $\overline{X}_4 := \left\{ -\sum_{i=1}^{\infty} l_i / 4^i \;\middle|\; l_i \in \{0, 1\} \right\}$, the conjugate or
left-handed quarter Cantor set (axis ticks $-1$, $-3/4$, $-1/2$, $-1/4$, 0).


Fig. 4.1. Cantor sets.


Fig. 4.2. The Cantor set $X_3$.


Fig. 4.3. The quarter Cantor set $X_4$.

Lemma 4.2.1. Let the set $X_4$ with associated measure $\mu$ denote the Cantor construction
of Figure 4.1(b) and (4.1.3)–(4.1.4). Specifically, $\mu$ is the Hausdorff measure on
the fractal $X_4$ with Hausdorff dimension 1/2. Let
$$\Lambda = \Lambda_4 := \left\{ l_0 + l_1 4 + l_2 4^2 + \cdots \;\middle|\; l_i \in \{0, 1\}, \text{ where the sums are finite} \right\}.$$

Then $\{e_\lambda \mid \lambda \in \Lambda\}$ is orthonormal in $L^2(X_4, \mu)$. Equivalently, $(X_4, \Lambda)$ is a spectral
pair.


Remark 4.2.2. The main result in this chapter is that $\{e_\lambda \mid \lambda \in \Lambda\}$ is in fact an
ONB.

Fig. 4.5. $\overline{X}_4$, the conjugate or left-handed quarter Cantor set.


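Proof of Lemma 4.2.1. Setting
$$\hat\mu(\xi) = \int e^{i2\pi\xi x}\,d\mu(x), \tag{4.2.1}$$
we get the scaling relation
$$\hat\mu(\xi) = \frac{1}{2}\left(1 + e^{i\pi\xi}\right) \hat\mu\left(\frac{\xi}{4}\right), \qquad \xi \in \mathbb{R}. \tag{4.2.2}$$

Since the inner products are
$$\langle e_\lambda \mid e_{\lambda'} \rangle_\mu = \int \overline{e_\lambda(x)}\, e_{\lambda'}(x)\,d\mu(x) = \hat\mu(\lambda' - \lambda), \tag{4.2.3}$$
and since iterating (4.2.2) and using $\left|\tfrac12(1 + e^{i\pi\xi})\right| = |\cos(\pi\xi/2)|$ bounds
$|\hat\mu(\xi)|$ by the absolute value of a finite product of cosines,
we need only show that, if $\lambda' \ne \lambda$ in $\Lambda$, then the product
$$\prod_{k=0}^{n} \cos\left(\frac{\pi(\lambda' - \lambda)}{2\cdot 4^k}\right) \tag{4.2.4}$$
vanishes for some $n$. If $\lambda' \ne \lambda$, there is a first term where $l_i \ne l_i'$. If it is $n$,
then $\lambda' - \lambda \in 4^n \cdot (\pm 1 + 4\mathbb{Z})$, and the last factor in (4.2.4) is then
$\cos\left(\frac{\pi(\lambda'-\lambda)}{2\cdot 4^n}\right) = \cos\left(\pm\frac{\pi}{2}\right) = 0$. □


Fig. 4.6. Alternate limiting approach to the conjugate quarter Cantor set $\overline{X}_4$.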
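The orthogonality in Lemma 4.2.1 can be observed numerically by iterating the scaling relation (4.2.2). The sketch below rests on several assumptions: the infinite product is truncated after `factors` terms, $\Lambda$ is restricted to at most five digits, and the general form $\tfrac12(1 + e^{i2\pi b\xi/N})$ of the factor for the IFS $x/N$, $(x+b)/N$ is derived from (4.1.4) rather than quoted from the text. The last line applies the same recipe to the middle-third set and the pair of frequencies 0 and 1 only; it shows that this particular pair fails to be orthogonal there, and it is not a proof that $X_3$ has no Fourier basis.

```python
import cmath
from itertools import product

def mu_hat(xi, scale=4, shift=2, factors=40):
    """Truncated product for the transform of the measure of the IFS x/scale, (x+shift)/scale."""
    value = 1.0 + 0.0j
    for p in range(1, factors + 1):
        value *= 0.5 * (1 + cmath.exp(2j * cmath.pi * shift * xi / scale ** p))
    return value

def spectrum(digits_count, scale=4):
    """All l_0 + l_1*scale + ... with l_i in {0, 1} and at most `digits_count` digits."""
    return sorted(sum(l * scale ** i for i, l in enumerate(ls))
                  for ls in product((0, 1), repeat=digits_count))

Lam = spectrum(5)
# Largest |<e_a | e_b>| over distinct a, b in the truncated Lambda; see (4.2.3).
worst4 = max(abs(mu_hat(a - b)) for a in Lam for b in Lam if a != b)
# Middle-third set: the exponentials with frequencies 0 and 1 overlap substantially.
overlap3 = abs(mu_hat(1, scale=3, shift=2))
```

For scale 4 every pair is orthogonal up to rounding, because one factor of the product is $\cos(\pm\pi/2)$; for scale 3 the first factors are $\cos(2\pi/3)$, $\cos(2\pi/9)$, and so on, none of which vanish.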


4.3 The conjugate Cantor set, and a special harmonic function 


Our next lemma is also from [JoPe98]. For an alternative to the argument from 
[JoPe98], see also an appendix to [JoPe98] written by R. Strichartz [Str98]. We state 
the lemma here without proof. The reader will easily be able to supply the details. 


Lemma 4.3.1.

(a) To verify the ONB property, it is enough to verify that the function
$$h_\Lambda(x) := \sum_{\lambda\in\Lambda} |\hat\mu(x - \lambda)|^2 \tag{4.3.1}$$
is constant and equal to 1, $x \in \mathbb{R}$.
(b) Setting $W(x) := \cos^2(2\pi x)$, see (4.1.1), we get
$$R_W h_\Lambda = h_\Lambda, \tag{4.3.2}$$
where
$$(R_W f)(x) := W\left(\frac{x}{4}\right) f\left(\frac{x}{4}\right) + W\left(\frac{x-1}{4}\right) f\left(\frac{x-1}{4}\right). \tag{4.3.3}$$


Remark 4.3.2. We establish the fact that $\{e_\lambda \mid \lambda \in \Lambda\}$ is an ONB for $L^2(X_4, \mu)$ by
showing that the eigenvalue problem (4.3.2) has only one solution $h_\Lambda$ which satisfies
$h_\Lambda(0) = 1$, and which is Lipschitz continuous.

We now turn to the random walk and the measures $P_x$ on $\Omega = \{0, 1\}^{\mathbb{N}}$:

Now the measures $P_x$ will be indexed by the conjugate fractal $\overline{X}_4$ in Figure
4.1(c), i.e., the one constructed from $\sigma\colon x \mapsto 4x \bmod \mathbb{Z}$, and the pair
$$\tau_0(x) = \frac{x}{4}, \qquad \tau_1(x) = \frac{x-1}{4}. \tag{4.3.4}$$

We say that the Cantor set $\overline{X}_4$ in the interval $[-1, 0]$ constructed from the IFS
(4.3.4), see Figures 4.5–4.6, is the conjugate of the quarter Cantor set $X_4$ from
Figures 4.3–4.4. Both sets are affine fractals with Hausdorff dimension 1/2: see formula
(4.1.5).

Let $X_4$ be the Cantor set of Figures 4.3–4.4, i.e., based on (4.1.3). The Cantor
set of the alternative system (4.3.4) will be denoted by $\overline{X}_4$, and we refer to it as the
conjugate Cantor set to $X_4$. It is a little harder to visualize than the fractals in Figures
4.1(a) and 4.1(b), and it differs from them in that the gap configuration appears to
have less separation. However, that is just a
“visual effect,” as one can make the gaps appear the same size as in the first one by
just plotting the image on $[-1/2, 0]$ instead of $[-1, 0]$. This fractal $\overline{X}_4$ in Figure
4.1(c) may be written as
$$\overline{X}_4 = \left\{ -\sum_{i=1}^{\infty} \frac{l_i}{4^i} \;\middle|\; l_i \in \{0, 1\} \right\}, \tag{4.3.5}$$
and we call it the conjugate or left-handed quarter Cantor set (or the “back side” or
“flip side” of the quarter Cantor set). It has the same Hausdorff dimension $s = 1/2$
as does $X_4$ from Figure 4.1(b).

Applying Lemma 2.4.1 to $W(x) = \cos^2(2\pi x)$ and the fractal $\overline{X}_4$ in (4.3.5), or
Figure 4.1(c), we get the following formulas for the measure $P_x$ on $\Omega = \{0, 1\}^{\mathbb{N}}$:
$$P_x(A(i_1, \dots, i_n)) = \prod_{p=0}^{n-1} \cos^2\left(\frac{\pi(x - \lambda)}{2\cdot 4^p}\right) \tag{4.3.6}$$
and
$$P_x(\{\omega(\lambda)\}) = \prod_{p=0}^{\infty} \cos^2\left(\frac{\pi(x - \lambda)}{2\cdot 4^p}\right) = |\hat\mu(x - \lambda)|^2, \tag{4.3.7}$$
where $A(i_1, \dots, i_n)$ denotes the cylinder set (2.3.5), and
$$\omega(\lambda) = (i_1, \dots, i_n, \underbrace{0, 0, 0, \dots}_{\infty \text{ string of zeroes}}) \tag{4.3.8}$$
for
$$\lambda = i_1 + i_2 4 + \cdots + i_n 4^{n-1}. \tag{4.3.9}$$

In view of (4.3.8)–(4.3.9), we may identify $\mathbb{N}_0$ and $\Lambda$. Both are uniquely represented
with points in $\Omega$ which terminate with an infinite string of zeroes, i.e., one of the
form $(i_1, \dots, i_n, 0, 0, 0, \dots)$ for some $n$.

It follows in particular that $\Lambda$, or equivalently $\mathbb{N}_0$, has measure 1 if and only if
$$P_x(\mathbb{N}_0) = h_\Lambda(x) = \sum_{\lambda\in\Lambda} |\hat\mu(x - \lambda)|^2 = 1 \qquad \text{for all } x \in \overline{X}_4. \tag{4.3.10}$$
AEN 


4.4 A sufficient condition for $P_x(\mathbb{N}_0) = 1$


Proposition 4.4.1. [JoPe98] With the stated properties, we have $P_x(\mathbb{N}_0) = 1$ for all
$x \in \overline{X}_4$.


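Proof. Let
$$\chi_n(\lambda) = \begin{cases} 1 & \text{if } \lambda \in \Lambda \text{ and } \lambda \le \frac{1}{3}(4^n - 1),\\ 0 & \text{otherwise,} \end{cases}$$
and set
$$F_x^{(n)}(\lambda) = \chi_n(\lambda) \prod_{p=0}^{n-1} \cos^2\left(\frac{\pi(x - \lambda)}{2\cdot 4^p}\right).$$
Then we get
$$\sum_{\lambda\in\Lambda} F_x^{(n)}(\lambda) = 1. \tag{4.4.1}$$
It is clear from (4.3.7) that
$$\lim_{n\to\infty} F_x^{(n)}(\lambda) = P_x(\{\omega(\lambda)\}). \tag{4.4.2}$$

To prove that $P_x(\mathbb{N}_0) = 1$, we must establish that the convergence in (4.4.2) is
dominated.

For $n$ large, the terms $\pi(x - \lambda)/(2\cdot 4^p)$ are close to 0 for $p > n$; and we may
pick some $b > 0$ and $n_0$ such that
$$\prod_{p=n}^{\infty} \cos^2\left(\frac{\pi(x - \lambda)}{2\cdot 4^p}\right) \ge b \qquad \text{for } n \ge n_0 \text{ and all } \lambda \text{ with } \chi_n(\lambda) = 1. \tag{4.4.3}$$

Since $P_x(\{\omega(\lambda)\})$ is the product of $F_x^{(n)}(\lambda)$ and the tail in (4.4.3), we conclude that
$$F_x^{(n)}(\lambda) \le b^{-1} P_x(\{\omega(\lambda)\}), \qquad n \ge n_0,\ \lambda \in \Lambda. \tag{4.4.4}$$

This is the desired domination, and we therefore may apply the dominated convergence
theorem to the sum $\sum_{\lambda\in\Lambda} \cdots$. The conclusion is
$$P_x(\mathbb{N}_0) = \sum_{\lambda\in\Lambda} |\hat\mu(x - \lambda)|^2 = 1,$$
which is the desired result. □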
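The conclusion of Proposition 4.4.1 can be watched converging numerically. The sketch below sums the point masses (4.3.7) over the elements of $\Lambda$ with at most $n$ digits, for a few values of $n$. The base point `x = -0.2`, the truncation of the infinite product after `factors` terms, and the helper names are assumptions made for illustration.

```python
import math
from itertools import product

def mu_hat_sq(xi, factors=40):
    """Truncation of |mu_hat(xi)|^2 = prod_{p>=0} cos^2(pi * xi / (2 * 4^p)); see (4.3.7)."""
    return math.prod(math.cos(math.pi * xi / (2 * 4 ** p)) ** 2 for p in range(factors))

def partial_mass(x, digits_count):
    """Sum of P_x({omega(lambda)}) over lambda in Lambda with at most `digits_count` digits."""
    total = 0.0
    for ls in product((0, 1), repeat=digits_count):
        lam = sum(l * 4 ** i for i, l in enumerate(ls))   # the encoding (4.3.9)
        total += mu_hat_sq(x - lam)
    return total

x = -0.2
masses = [partial_mass(x, n) for n in (2, 6, 10)]
```

The partial sums increase towards 1, which is the statement $h_\Lambda(x) = P_x(\mathbb{N}_0) = 1$ of (4.3.10) seen through finite truncations.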


Remark 4.4.2. In the next chapter, we will prove a theorem, Theorem 5.4.1, for
general path-space measures, which shows when $\mathbb{N}_0$ has full measure in $\Omega$. As in the
proof of Proposition 4.4.1 above, our reasoning there will be based on a domination
argument.


Conclusions 


When Lemma 4.3.1 and Proposition 4.4.1 are combined we conclude that the set
$\{e_\lambda \mid \lambda \in \Lambda\}$ is an orthonormal basis (ONB) in the Hilbert space $L^2(X_4, \mu)$; i.e.,
that the Cantor set $X_4$ has a Fourier basis. The orthogonality of the functions $e_\lambda$ is
the easier part of the argument. It follows from (4.3.2) in the lemma. But only by also
proving that the special function $h_\Lambda$ in (4.3.1), the minimal $R_W$-harmonic function,
is in fact the constant function 1 are we able to infer that the set $\{e_\lambda \mid \lambda \in \Lambda\}$ is total
in $L^2(X_4, \mu)$, i.e., that it is an ONB.

However, this step is quite analogous to a key argument which we already
encountered in Chapter 1 for wavelets. The argument for the $L^2$-density of the linear
span of $\{e_\lambda \mid \lambda \in \Lambda\}$ hinges on (4.3.10) as follows: First use (4.3.10) to show that
every $e_t$ can be approximated in $L^2$ by functions in the span of the $e_\lambda$’s. Then use
Stone–Weierstrass to infer that the span of the $e_t$’s is dense in $L^2$.


Exercises 


4.1. Let $\tau_0$ and $\tau_1$ be the transformations in (4.1.3), and let $X_4$ be the quarter Cantor
set in Figure 4.3 (p. 74). Show that
$$X_4 = \tau_0(X_4) \cup \tau_1(X_4).$$

4.2. Formulate and prove the analogous result for the usual middle-third Cantor set
$X_3$.


4.3. Let $\tau_0$, $\tau_1$, and $\sigma$ be the transformations of $X_4$ introduced in connection with
(4.1.3).
(a) Compare the following two sets of Borel probability measures on $X_4$:
$$M_\sigma = \{\mu \mid \mu \circ \sigma^{-1} = \mu\} \quad\text{and}\quad M_\tau = \left\{\mu \;\middle|\; \mu = \tfrac{1}{2}\left(\mu \circ \tau_0^{-1} + \mu \circ \tau_1^{-1}\right)\right\}.$$
(b) Estimate the cardinality of the two sets $M_\sigma$ and $M_\tau$ in (a).

4.4. Give a direct verification of the scaling law (4.2.2) in Lemma 4.2.1.


4.5. Try to carry out the arguments in Lemma 4.2.1 for the middle-third Cantor set
$X_3$ in place of $X_4$, and check what goes wrong.


References and remarks 


Readers not already familiar with fractals might wish to consult some relevant refer- 
ences covering analysis on affine fractals, e.g., the very readable books [Fal85, Fal90, 
Bal00], and/or the papers [LaNg98, FaLa99, LaNR01, FeLa02, BrMo75, Hut81,
JoPe98, Hal83]. As for a student-friendly presentation of fractals in many parts of 
mathematics, we can recommend Devaney’s lovely little book [Dev92]. 

The regularity issues that are the focus of this chapter are related to harmonic 
analysis questions from the theory of affine iterated function systems (IFS), and there 
is a substantial amount of work by Strichartz and others that spells out more connec- 
tions to diverse areas of mathematics and applications. The treatment we gave above 
is especially inspired by [Str98], and [Str00]. In the context of our Chapters 4–5,
these papers address the question from Proposition 4.4.1 above; see also Theorem 
5.4.1 below. The question and the general theme were central to the paper [JoPe98]. 
In fact [Str98] came about from Strichartz’s suggested alternative approach to the 
result in [JoPe98]. And other papers followed [JoPe98], for example [LaWa02] by 
Laba and Wang. 

At the conclusion of our work on this book, we received a preprint [Str05] from 
Strichartz which continues the harmonic analysis theme from [JoPe98] and [Str98], 
but in a different direction, viz., convergence: The new preprint [Str05] addresses a 
degree to which fractal Fourier series from [JoPe98] and the papers following it tend 
to be more localized, and as a result have better convergence properties than do the 
classical Fourier series. This localization of the Mock Fourier series is interesting as 
it is analogous to a strong localization property of wavelet bases, and we turn to that 
in detail in Chapter 7 below. 

There is a vast diversity of geometric structures with self-similarity, or with an
inherent consistency of scale. It includes wavelets, fractals and classes of models from
dynamics. However, within this variety, the fractal structures that are most amenable
to mathematical analysis have their self-similarity defined by affine mappings (in 
ambient Euclidean space). But even this more narrow focus, restricting to affine 
mappings (or rather affine function systems), encompasses both standard and non- 
standard wavelets, as well as some of the best-known fractals. Our present viewpoint 
is to study these geometries in the light of Fourier duality. 

But there are lots of other viewpoints: A delightful and student-friendly presen- 
tation of the class of fractals we have in mind here is the little book [YaHK97] by 
Yamaguti, Hata, and Kigami. It is one of the few available books which aims to unify 
wavelets and fractals, but its aim is potential-theoretic: Laplace operators, resistance 
inequalities, and so on.


Infinite products


Albert Einstein to Oscar Veblen (a math professor): 
“The Lord God is subtle, but malicious he is not.” 
(Raffiniert ist der Herr Gott, aber boshaft ist Er nicht.) 
And ten years later: 
“I have second thoughts. Maybe God is malicious.”
—Albert Einstein 


PREREQUISITES: Riesz; random; Fubini; square-integrable functions. 


Prelude 


In the first two chapters, we introduced the random walks that are used throughout 
the book. We outlined this in the context of endomorphisms of compact spaces X and 
combinatorial trees; and we showed in Chapters 3 and 4 how this applies to wavelets 
and fractals. The combinatorial trees to keep in mind for illustration are sketched in 
Figures 1.1 (p. 8) and 2.1 (the Farey tree, p. 42). Recall further that the transition 
probabilities in the random-walk model are assigned via a prescribed function W on 
X which is assumed to satisfy a certain normalization condition. Within the context 
of signals, W is the absolute square of some frequency function m, or of a wavelet 
filter. The various paths within our tree can originate at points x chosen from the 
set X. As before, X carries a fixed finite-to-one endomorphism $\sigma$. If x and y are
points in X such that $\sigma(y) = x$, then the number W(y) represents the probability
of a transition from x to y. Step-by-step conditional probabilities and finite products
are used in assigning probabilities to finite paths which originate at x. (The simplest
instance of this idea is for the case when X is the circle, i.e., the one-torus $\mathbb{T}$. For each
N, we may then consider $\sigma(z) := z^N$. And in the context of wavelet constructions,
we introduced the additive formulation of the distinct branches of the inverse of
$z \mapsto z^N$ when z is complex and restricted to $\mathbb{T}$.)
Here we are concerned with the case of infinite paths. For this we must use infinite
products coupled with a fundamental idea of Kolmogorov. These infinite products are
the subject of the present chapter. The measure which assigns probabilities to subsets
of paths starting at x is called $P_x$. To understand the measures $P_x$ and their support,
convergence issues for infinite products come into play.




The study of infinite products has a long history in analysis, see especially [BeBK05] 
and the references given there; in number theory [Mey79], lacunary trigonometric 
series [Zyg32]; and in ergodic theory [BrMo75, Kak48, Kat87], as well as in other 
applications, e.g., [BeBe95, Rit79]. A special case of the infinite products we dis- 
cuss in the present chapter includes the so-called Riesz products [Rie18], and their 
many variants. While the treatment in [BeBK05] is centered around certain
multi-scale properties of a class of generalized Riesz products, our present discussion is
motivated instead by our probabilistic viewpoint, i.e., the use of random walk, iter- 
ated function systems (IFS), and probability theory. 


5.1 Riesz products 


5.2 Random products 


In this chapter, we consider a general class of infinite products which is connected to 
the random-walk systems from Chapter 2. A special case of these infinite products 
arises in the analysis of wavelets, to which we shall turn in the next chapter. 

We begin with a lemma. 


Lemma 5.2.1. Let $(X, \mathcal{B})$ be a measure space as described in Chapter 2. Let $\sigma\colon X \to X$ and $N$ be as described, and let $\tau_0, \dots, \tau_{N-1}$ be a choice of branches of $\sigma^{-1}$. Let $W\colon X \to [0,1]$ be measurable and satisfy (2.4.1), i.e., $\sum_{y\colon \sigma(y) = x} W(y) = 1$ for all $x \in X$. Then the infinite product

$$F(x) := \lim_{n\to\infty} \prod_{k=1}^{n} W\left(\tau_0^k(x)\right), \quad x \in X, \tag{5.2.1}$$

converges; and the identity
$$F(x) = W(\tau_0(x))\, F(\tau_0(x)), \quad x \in X, \tag{5.2.2}$$
holds if and only if $P_x(\{0\}) > 0$.
Proof. As in Chapter 2, set $\Omega := \{0, 1, \dots, N-1\}^{\mathbb{N}}$, and let $0 = (0, 0, 0, \dots)$ be the point in $\Omega$ which is given by an infinite string of zeroes. Let $A(\underbrace{0, 0, \dots, 0}_{n \text{ times}})$ be the cylinder sets from (2.3.5), and let $P_x$, $x \in X$, be the Radon measures constructed in Lemma 2.4.1. By (2.4.2), we have

$$P_x\bigl(A(\underbrace{0, 0, \dots, 0}_{n \text{ times}})\bigr) = \prod_{k=1}^{n} W\left(\tau_0^k(x)\right). \tag{5.2.3}$$

Since
$$\{0\} = \bigcap_{n} A(\underbrace{0, 0, \dots, 0}_{n \text{ times}})$$
and
$$A(\underbrace{0, 0, \dots, 0}_{n+1 \text{ times}}) \subset A(\underbrace{0, 0, \dots, 0}_{n \text{ times}}),$$
we get

$$P_x(\{0\}) = \lim_{n\to\infty} P_x\bigl(A(\underbrace{0, 0, \dots, 0}_{n \text{ times}})\bigr).$$

In view of (5.2.3), the desired conclusion (5.2.1) follows, and
$$F(x) = P_x(\{0\}).$$

The second conclusion (5.2.2) is also immediate from the convergence of the products in (5.2.3) as $n \to \infty$. □
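The product (5.2.1) can be explored numerically. The following is a minimal sketch, not part of the original text, under the assumption $N = 2$, $\sigma(x) = 2x \bmod 1$, $\tau_0(x) = x/2$, and the illustrative weight $W(x) = \cos^2(\pi x)$, which satisfies the normalization $W(x/2) + W((x+1)/2) = 1$. The function names and the truncation depth are hypothetical choices. For this particular $W$ the classical identity $\prod_{k \ge 1} \cos(\pi x/2^k) = \sin(\pi x)/(\pi x)$ gives a closed form to compare against.

```python
import math

def W(x):
    # Assumed example weight (not from the text): W(x) = cos^2(pi x).
    # It satisfies W(x/2) + W((x+1)/2) = 1, the N = 2 case of (2.4.1).
    return math.cos(math.pi * x) ** 2

def tau0(x):
    # Branch tau_0 of the inverse of sigma(x) = 2x mod 1.
    return x / 2.0

def F(x, depth=60):
    # Truncation of the infinite product (5.2.1):
    # prod_{k=1}^{depth} W(tau_0^k(x)).
    y, prod = x, 1.0
    for _ in range(depth):
        y = tau0(y)
        prod *= W(y)
    return prod

x = 0.3
closed_form = (math.sin(math.pi * x) / (math.pi * x)) ** 2
print(F(x), closed_form)                # the two values should agree closely
print(F(x), W(tau0(x)) * F(tau0(x)))    # numerical check of (5.2.2)
```

With a truncation depth of 60 the neglected factors differ from 1 by far less than floating-point precision, so the truncated product is a reasonable stand-in for $F(x) = P_x(\{0\})$ in this example.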


5.3 The general case 


We begin with a remark concerning the embedding of natural numbers in $\Omega$.


Remark 5.3.1. Let $(i_1, \dots, i_n, 0)$ be the point in $\Omega$ which is the concatenation of $(i_1, \dots, i_n)$ with an infinite string of zeroes. For the measure of the singleton $\{(i_1, \dots, i_n, 0)\}$ we then get

$$P_x(\{(i_1, \dots, i_n, 0)\}) = W(\tau_{i_1} x) \cdots W(\tau_{i_n} \cdots \tau_{i_1} x)\, F(\tau_{i_n} \cdots \tau_{i_1} x), \quad x \in X. \tag{5.3.1}$$

By the Euclidean algorithm, the natural numbers $k \in \mathbb{N}_0 = \{0, 1, 2, \dots\}$ have a unique representation
$$k = i_1 + i_2 N + \cdots + i_n N^{n-1} \tag{5.3.2}$$
with $i_j \in \mathbb{Z}_N = \{0, 1, \dots, N-1\}$. We recall that $\mathbb{N}_0$ is naturally embedded in $\Omega$. If $n$ is fixed, (5.3.2) is a unique representation of all $k$, $0 \le k \le N^n - 1$. It is convenient to identify $k$ with the singleton $\{\omega(k)\} = \{(i_1, \dots, i_n, 0)\}$ in $\Omega$. So each $k \in \mathbb{N}_0$ represents a unique singleton in $\Omega$, and the mapping $k \mapsto (i_1, \dots, i_n, 0)$ is 1-1. Hence
$$P_x(\mathbb{N}_0) = \sum_{k=0}^{\infty} P_x(\{k\}), \tag{5.3.3}$$
or more precisely,

$$P_x(\mathbb{N}_0) = \sum_{n=1}^{\infty} \sum_{(i_1, \dots, i_n)} P_x(\{(i_1, \dots, i_n, 0)\}). \tag{5.3.4}$$
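The identification of $k \in \mathbb{N}_0$ with a point of $\Omega$ in Remark 5.3.1 is the base-$N$ digit expansion with the least significant digit first. A minimal sketch follows; the helper names `digits`, `omega_prefix`, and `from_digits` are hypothetical and introduced only for illustration.

```python
def digits(k, N):
    # Digits (i_1, ..., i_n) of k in base N, least significant first,
    # so that k = i_1 + i_2*N + ... + i_n*N**(n-1), as in (5.3.2).
    if k < 0 or N < 2:
        raise ValueError("need k >= 0 and N >= 2")
    out = []
    while k > 0:
        k, r = divmod(k, N)
        out.append(r)
    return out

def omega_prefix(k, N, length):
    # First `length` coordinates of omega(k): the digits of k followed
    # by zeroes (the infinite tail of zeroes is truncated here).
    d = digits(k, N)
    if length < len(d):
        raise ValueError("length is shorter than the digit string of k")
    return tuple(d + [0] * (length - len(d)))

def from_digits(d, N):
    # Inverse map: recover k from its digit string.
    return sum(i * N**j for j, i in enumerate(d))

print(digits(11, 2))            # 11 = 1 + 1*2 + 0*4 + 1*8
print(omega_prefix(11, 2, 6))
```

Since trailing zeroes do not change `from_digits`, distinct natural numbers give distinct points of $\Omega$, which is the 1-1 property used in (5.3.3).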


Proposition 5.3.2. Let $X$, $\sigma$, $\tau_0, \dots, \tau_{N-1}$, $W$, and $(P_x)_{x \in X}$ be as specified in Lemma 5.2.1. Then the function

$$h(x) := P_x(\mathbb{N}_0), \quad x \in X, \tag{5.3.5}$$
is harmonic for $R_W$.


Proof. Since $P_x$ is a Radon probability measure on $\Omega$ for each $x \in X$, by Lemma 2.4.1, it follows that the following infinite-sum representation for $h(x)$ is convergent:

$$h(x) = \sum_{n=1}^{\infty} \sum_{(i_1, \dots, i_n)} W(\tau_{i_1} x) \cdots W(\tau_{i_n} \cdots \tau_{i_1} x)\, F(\tau_{i_n} \cdots \tau_{i_1} x). \tag{5.3.6}$$

With the Ruelle operator $R = R_W$, we get

$$(Rh)(x) = \sum_{s=0}^{N-1} W(\tau_s x)\, h(\tau_s x)$$
$$= \sum_{s} \sum_{n,\, i_1, \dots, i_n} W(\tau_s x)\, W(\tau_{i_1} \tau_s x) \cdots W(\tau_{i_n} \cdots \tau_{i_1} \tau_s x)\, F(\tau_{i_n} \cdots \tau_{i_1} \tau_s x)$$
$$= h(x),$$
which is the desired conclusion. □


5.4 A uniqueness theorem 
In our next result, we give a sufficient condition for the property
$$P_x(\mathbb{N}_0) = 1 \tag{5.4.1}$$
to hold. In [Gun00], the process $(P_x)_{x \in X}$ is said to be tight if (5.4.1) is satisfied.
By now the transfer operator is often called the Ruelle operator because of D.
Ruelle’s use of it in the 1960s on phase-transition problems in statistical mechanics 
[Rue69]. The probability aspect of this endeavor was stressed by R. Gundy [Gun00], 
and others, e.g., [CoRa90]. Our presentation here relies in crucial ways on Gundy’s 
viewpoint, as it seems to unify the different threads in the subject that came before it. 
Three early elements of the theory may be summarized in three different but closely
related aspects of filters (as filter response functions) in the form of: (1) tilings (A.
Cohen), (2) the transfer operator (W. Lawton), and (3) cycles (I. Daubechies, among
others). For more citations, we refer to the comments at the end of this chapter, and 
to the references in the papers and books cited above. 

Mathematically, a filter is just a sequence, but a very special one, and filters 
were used originally in operations on discrete time signals. In its simplest form, 
this operation is merely the Cauchy product of sequences. If the filter sequence is 
taken as the Fourier coefficients of a periodic function m, then we say that m is 
the frequency response function. The operation on functions is now just pointwise 
multiplication. The numbers that occur as filter sequences are closely related to the 
masking coefficients of computer graphics; see, e.g., [BrJo02b] and [Jor03]. 
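The passage from the Cauchy product of sequences to pointwise multiplication of frequency response functions can be checked on small examples. The sketch below is an illustration only. It assumes finitely supported filters indexed from $k = 0$ and the sign convention $m(x) = \sum_k a_k e^{-i 2\pi k x}$, which is an assumption of this sketch, as are the function names and the sample sequences.

```python
import cmath

def cauchy_product(a, b):
    # Cauchy product (discrete convolution) of two finite sequences:
    # c_n = sum_{i+j=n} a_i * b_j.
    c = [0j] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def response(a, x):
    # Frequency response m(x) = sum_k a_k exp(-i 2 pi k x) of a finite
    # filter a = (a_0, ..., a_n); the sign convention is an assumption.
    return sum(ak * cmath.exp(-2j * cmath.pi * k * x) for k, ak in enumerate(a))

a = [0.5, 0.5]          # a Haar-type averaging filter
b = [1.0, -2.0, 3.0]    # an arbitrary second sequence, for illustration
c = cauchy_product(a, b)

x = 0.137
print(response(c, x))
print(response(a, x) * response(b, x))  # should match the line above
```

The agreement holds for every $x$ because multiplying the two trigonometric polynomials and collecting powers of $e^{-i 2\pi x}$ is exactly the Cauchy product of their coefficient sequences.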

The filters have been used in the form of subband filters, and quadrature-mirror
filters, in image/signal processing for half a century, and it is from signal-processing
engineering that the subject of wavelet theory has adopted such engineering
terms as low-pass/high-pass filters, down-sampling, up-sampling, and perfect signal
(or image) reconstruction.

Since wavelets ideally aim for orthogonal bases, initially in the Hilbert space 
$L^2(\mathbb{R})$, properties of filters which detect orthogonality of the basis functions were the
focus of attention for both the initial work on wavelet filters, and the subsequent more 
probabilistic approach. The probabilistic approach in fact merges the three initial 
criteria, i.e., tiling (A. Cohen), the transfer operator (W. Lawton), and cycles (graph 
theory on trees, discrete paths, and random walk). Each of the three elements alone 
misses some important features of orthogonality tests for filters, with filter response 
functions that are singular, for example filter functions that are only measurable, and 
not continuous. This generalization is not just idle abstraction, but is motivated by 
uses on wavelet bases with localization in frequency bands; see, e.g., [BaMe99]. If 
more smoothness in the time domain is imposed, then (by uncertainty) irregularities 
in the frequency domain tend to pop up. 

A further advantage of the new meeting ground for analysis and probability is 
its impact on other basis issues in harmonic analysis, such as those arising in the 
joint work of the author with S. Pedersen and D. Dutkay, e.g., [Kat87], [JoPe96], 
[DuJo06b], and related research by R. Strichartz, Y. Wang and others. See [Jor03], 
[JoPe98], [DuJo06b], and the papers cited there, for additional discussion and refer- 
ences. 

We aim to have the theorems which follow in this chapter combine the diverse 
ideas and results that came out in the last two decades. While these results originated 
with diverse mathematical problems or engineering applications in mind, they all
seem to share a fundamental mathematical principle, which perhaps now with hind- 
sight finally emerges as a sharper contour. So at this time, the subject seems to have 
reached a level of maturity, and we hope it is by now ready for the classroom. But the 
individual results (over the two decades) taken in isolation are likely to miss some of 
the common threads. 


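Theorem 5.4.1. Let $X$, $\sigma$, $\tau_0, \dots, \tau_{N-1}$, $W$, and $(P_x)_{x \in X}$ be as above. Suppose for some $x \in X$ there is an $n_0 \in \mathbb{N}$ and $b \in \mathbb{R}_+$ such that

$$\prod_{p \ge n_0} W(\tau_{i_p} \cdots \tau_{i_1} x) \ge b \quad \text{for all } (i_1, i_2, \dots). \tag{5.4.2}$$
Then it follows that
$$P_x(\mathbb{N}_0) = 1. \tag{5.4.3}$$
Remark 5.4.2. In probabilistic terms, condition (5.4.2) is an assertion about the tail sequences, i.e., the estimate
$$P_x(\underbrace{\mathbb{Z}_N \times \cdots \times \mathbb{Z}_N}_{n_0 \text{ times}} \times \{\omega\}) \ge b \quad \text{for all } \omega \in \Omega. \tag{5.4.4}$$

Note that each set in (5.4.4) is finite, of cardinality $N^{n_0}$, but there is an infinite number of sets.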


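Proof of Theorem 5.4.1. Suppose condition (5.4.2) above holds. Let $n_0$ and $b$ be chosen as stated. Since $W \le 1$, we get

$$\prod_{p \ge n} W(\tau_{i_p} \cdots \tau_{i_1} x) \ge \prod_{p \ge n_0} W(\tau_{i_p} \cdots \tau_{i_1} x) \ge b \quad \text{for all } n \ge n_0. \tag{5.4.5}$$

Using the unique representation
$$k = i_1 + i_2 N + \cdots + i_n N^{n-1} \quad \text{for } k \in \mathbb{N}_0,$$
we set $\operatorname{length}(k) = n$ if $i_n$ is the last non-zero term, and

$$F_x^{(n)}(k) = \begin{cases} \prod_{p=1}^{n} W(\tau_{i_p} \cdots \tau_{i_1} x) & \text{if } \operatorname{length}(k) \le n, \\ 0 & \text{otherwise.} \end{cases}$$

Then
$$\sum_{k \in \mathbb{N}_0} F_x^{(n)}(k) = 1, \tag{5.4.6}$$
and
$$\lim_{n \to \infty} F_x^{(n)}(k) = P_x(\{k\}). \tag{5.4.7}$$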
Fig. 5.1. The 2-adic fractions.
For $n \ge n_0$, we get the estimate
$$F_x^{(n)}(k) \le b^{-1} P_x(\{k\}) \quad \text{for } k \in \mathbb{N}_0. \tag{5.4.8}$$

Since
$$\sum_{k \in \mathbb{N}_0} P_x(\{k\}) = P_x(\mathbb{N}_0) \le 1,$$
the estimate (5.4.8) shows that the convergence (5.4.7) is dominated. Hence we may exchange the limits. Condition (5.4.6) shows that, as claimed,

$$\sum_{k \in \mathbb{N}_0} P_x(\{k\}) = P_x(\mathbb{N}_0) = 1. \qquad \Box$$


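Example 5.4.3. $W(x) := \cos^2(3\pi x)$, $X = [0,1]$, $\sigma\colon x \mapsto 2x \bmod 1$, $\tau_i(x) = (x + i)/2$, $i \in \{0, 1\}$. See Figure 5.1. The condition (5.4.2) is not satisfied for any $x \in [0,1]$, and

$$0 < P_x(\mathbb{Z}) < 1.$$

To see this, recall that if
$$k = i_1 + 2 i_2 + \cdots + 2^{n-1} i_n,$$
then
$$W(\tau_{i_n} \cdots \tau_{i_1} x) = \cos^2\left(\frac{3\pi (x + k)}{2^n}\right).$$

To show that (5.4.2) does not hold, we check that for all $b \in \mathbb{R}_+$, all $n_0 \in \mathbb{N}$, we can find $n > n_0$ and $k \in \mathbb{N}_0$, $k < 2^{n-1}$, such that

$$\cos^2\left(\frac{3\pi (x + k)}{2^n}\right) < b.$$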
Fig. 5.2. Powers of $\sigma$.


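Since $\lim_{n \to \infty} (3\pi x / 2^n) = 0$, the conclusion follows: just approximate $1/6$ with a sequence of dyadic rationals $k/2^n$, $0 \le k < 2^{n-1}$. Then

$$\lim_{(n,k)} \cos^2\left(\frac{3\pi (x + k)}{2^n}\right) = \lim_{(n,k)} \cos^2\left(\frac{3\pi k}{2^n}\right) = \cos^2\left(\frac{\pi}{2}\right) = 0.$$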


As an application to the usual N-adic representation of fractions, we now read off the following corollary. First set $X = [0,1]$ = the unit interval, and let

$$\sigma(x) = Nx \bmod 1, \quad \text{and} \quad \tau_k(x) = \frac{x + k}{N}, \quad k \in \{0, 1, \dots, N-1\}. \tag{5.4.9}$$

If $N = 2$, these mappings are graphed as in Figure 5.1. Moreover, for this example, we can compute
$$h(x) = P_x(\mathbb{Z})$$
explicitly. In Section 6.2, we show that

$$P_x(\mathbb{Z}) = \left(\frac{\sin 3\pi x}{3 \sin \pi x}\right)^2.$$


Corollary 5.4.4. Let $X = [0,1]$, $N \in \mathbb{N}$, $N \ge 2$, be given, and let $\sigma$, $\tau_0, \dots, \tau_{N-1}$ be the N-adic maps of (5.4.9). Let $W\colon [0,1] \to [0,1]$ satisfy (2.4.1), i.e.,

$$\sum_{s=0}^{N-1} W\left(\frac{x + s}{N}\right) = 1, \quad x \in [0,1]. \tag{5.4.10}$$

Let $(P_x)_{x \in [0,1]}$ be the transition measures of Lemma 2.4.1. Then
$$h(x) = P_x(\mathbb{N}_0) = \sum_{k=0}^{\infty} \prod_{n=1}^{\infty} W\left(\frac{x + k}{N^n}\right) \tag{5.4.11}$$
is convergent and represents an $R_W$-harmonic function on $[0,1]$.


Following Remark 1.4.2, we extend the measure $P_x$ on $\mathbb{N}_0$ to $\mathbb{Z}$, making use of
two copies of $\Omega = \{0, 1, \dots, N-1\}^{\mathbb{N}}$. Identifying $k \in \mathbb{N}_0$ with $\omega(k)$, consider the
case $-N^n \le k < 0$. Then set


Py ({k}) = Px ({o (er a k) \) . (5.4.12) 
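The closed form quoted above for Example 5.4.3 can be compared with a direct numerical summation. The sketch below is an illustration only, with $N = 2$ and $W(x) = \cos^2(3\pi x)$. It assumes that the atom at each integer $k$, of either sign, is the product $\prod_{n \ge 1} W((x+k)/2^n)$, i.e., the summand in (5.4.11) extended to negative $k$. This reading of the extension to $\mathbb{Z}$ is an assumption of the sketch, and so are the truncation parameters `depth` and `K` and the function names.

```python
import math

def W(x):
    # Weight function of Example 5.4.3.
    return math.cos(3 * math.pi * x) ** 2

def atom(x, k, depth=60):
    # Truncated product prod_{n=1}^{depth} W((x + k) / 2^n); assumed here
    # to represent the mass placed at the integer k (of either sign).
    y, prod = x + k, 1.0
    for _ in range(depth):
        y /= 2.0
        prod *= W(y)
    return prod

def h_numeric(x, K=2000):
    # Partial sum over -K <= k <= K; the neglected tail is of order 1/K.
    return sum(atom(x, k) for k in range(-K, K + 1))

def h_closed(x):
    # Closed form quoted in the text: (sin(3 pi x) / (3 sin(pi x)))^2.
    return (math.sin(3 * math.pi * x) / (3 * math.sin(math.pi * x))) ** 2

x = 0.25
print(h_numeric(x), h_closed(x))  # both should be close to 1/9
```

The partial sums converge slowly, since the individual terms decay like $1/k^2$, so the comparison is only meaningful up to a tolerance of order $1/K$.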
Then dulcet music swelled

Concordant with the life-strings of the soul; 

It throbbed in sweet and languid beatings there, 

Catching new life from transitory death; 

Like the vague sighings of a wind at even 

That wakes the wavelets of the slumbering sea. . . 
-—Percy Bysshe Shelley 


We now use the martingale method of Remark 2.7.2 in the computation of the 
cocycle V associated with the distinguished harmonic function h from (5.4.11). 


Example 5.5.1. It may very well happen that the measure $P_x$ assigns zero mass to singletons in $\Omega$, i.e., that it is non-atomic.

(a) Let $X = [0,1]$, and let $\sigma$ be the endomorphism in Figure 5.1, i.e., $\sigma(x) = 2x \bmod 1$. Let $W$ be the constant function, i.e., $W \equiv 1/2$, $x \in [0,1]$. Then it is immediate that $P_x$ is the product measure on $\Omega = \{0,1\}^{\mathbb{N}}$ with weights $p_0 = p_1 = 1/2$ for all $x \in [0,1]$, and therefore $P_x(\{\omega\}) = 0$ for all $\omega \in \Omega$.

(b) Let $\varphi \in L^2(\mathbb{R})$, and suppose $\varphi$ satisfies the scaling identity (1.3.1) for $N = 2$ and some sequence $a_k$ of masking coefficients with

$$\sum_{k \in \mathbb{Z}} \bar{a}_k a_{k+2l} = \frac{1}{2} \delta_{0,l}.$$

Set
$$W(x) = \Bigl| \sum_{k} a_k e^{-i 2\pi k x} \Bigr|^2.$$

Suppose
$$\int_{\mathbb{R}} \varphi(x)\, dx = 1.$$
Then it follows from Corollary 5.4.4 that
$$P_x(\{\omega(k)\}) = |\hat{\varphi}(x + k)|^2, \quad k \in \mathbb{Z}, \tag{5.5.1}$$
where $\omega(k) = (\omega_1, \omega_2, \dots)$ is given by Euclid and $k = \omega_1 + \omega_2 2 + \cdots + \omega_n 2^{n-1}$, $0 \le k \le 2^n - 1$.
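Proposition 5.5.2. There is a shift-invariant function $v\colon \Omega \to [0,1]$ such that
$$V(x, \omega) = h(v(\omega)), \quad \omega \in \Omega, \tag{5.5.2}$$
is the cocycle corresponding to the harmonic function $h$ in (5.4.11). Moreover, if $\omega = (\omega_1, \omega_2, \dots) \in \Omega = \{0, \dots, N-1\}^{\mathbb{N}}$, then

$$v(\omega) = \lim_{n \to \infty} \left( \frac{\omega_1}{N^n} + \frac{\omega_2}{N^{n-1}} + \cdots + \frac{\omega_n}{N} \right). \tag{5.5.3}$$

In particular, $v$ vanishes on the N-adic rational fractions.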


Proof. Since there is a 1-1 correspondence between harmonic functions and cocycles, see Theorem 2.7.1, the cocycle corresponding to $h$ in (5.4.11) is unique. If we show that it has the form (5.4.11), it follows from Corollary 5.4.4 that the function $v$ must be shift-invariant, i.e., that

$$v(\omega_1, \omega_2, \dots) = v(\omega_2, \omega_3, \dots). \tag{5.5.4}$$

In view of Remark 2.7.2 (or the martingale convergence theorem), we know that, for $P_x$ a.e. $\omega$, we have

$$V(x, \omega) = \lim_{n \to \infty} h(\tau_{\omega_n} \cdots \tau_{\omega_1} x); \tag{5.5.5}$$
but
$$\tau_{\omega_n} \cdots \tau_{\omega_1} x = \frac{x}{N^n} + \frac{\omega_1}{N^n} + \cdots + \frac{\omega_n}{N}, \tag{5.5.6}$$

and it follows that the limit in (5.5.5) is independent of $x$. But the sequence of fractions in (5.5.3) or (5.5.6) is non-increasing, i.e.,
Se se a ee a oa ee 
N ~ Nutl ” Nn N 
The monotonicity follows from an easy induction argument. The fractions are clearly in the compact interval $[0,1] \subset \mathbb{R}$.
In order to understand the function $v\colon \Omega \to [0,1]$ from (5.5.3), note that

$$v(\omega_1, \omega_2, \dots, \omega_n, \underbrace{0, 0, 0, \dots}_{\infty \text{ string of zeroes}}) = 0, \tag{5.5.7}$$

i.e., $v$ vanishes on the N-adic rationals. □
Exercises
5.1. Let
$$m(z) = \sum_{k \in \mathbb{Z}} a_k z^k \in L^\infty(\mathbb{T}),$$
and set

$$(Sf)(z) = \sqrt{2}\, m(z) f(z^2), \quad f \in L^2(\mathbb{T}).$$

(By $L^2(\mathbb{T})$, we mean the Hilbert space of all $L^2$-functions on $\mathbb{T}$ where $\mathbb{T}$ is equipped with the unique normalized Haar measure.)
Show that the following three conditions are equivalent.

(i) $S$ defines an isometry of $L^2(\mathbb{T})$.
(ii) $|m(z)|^2 + |m(-z)|^2 = 1$ a.e. $z \in \mathbb{T}$.
(iii) $\sum_{k \in \mathbb{Z}} \bar{a}_k a_{k+2l} = \frac{1}{2} \delta_{0,l}$, $l \in \mathbb{Z}$.

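5.2. Let $N \in \mathbb{N}$, $N \ge 2$, and let

$$m(z) = \sum_{k \in \mathbb{Z}} a_k z^k \in L^\infty(\mathbb{T}).$$

Set
$$(Sf)(z) = \sqrt{N}\, m(z) f(z^N), \quad f \in L^2(\mathbb{T}).$$

Show that the following three conditions are equivalent.

(i) $S$ defines an isometry of $L^2(\mathbb{T})$.
(ii) $\sum_{k=0}^{N-1} \left| m\left(z e^{i 2\pi k/N}\right) \right|^2 = 1$ a.e. $z \in \mathbb{T}$.
(iii) $\sum_{k \in \mathbb{Z}} \bar{a}_k a_{k+Nl} = \frac{1}{N} \delta_{0,l}$, $l \in \mathbb{Z}$.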

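5.3. Let $N \in \mathbb{N}$, $N \ge 2$, and let

$$m(z) = \sum_{k \in \mathbb{Z}} a_k z^k \in L^\infty(\mathbb{T}).$$

As in Exercise 5.2, set
$$Sf(z) = \sqrt{N}\, m(z) f(z^N), \quad f \in L^2(\mathbb{T}).$$
The adjoint operator $F = S^*$ is defined by the identity

$$\langle Ff \mid g \rangle_{L^2} = \langle f \mid Sg \rangle_{L^2}, \quad f, g \in L^2(\mathbb{T}).$$
(a) Show that

$$(Ff)(z) = \frac{1}{\sqrt{N}} \sum_{\substack{w \in \mathbb{T} \\ w^N = z}} \overline{m(w)}\, f(w), \quad f \in L^2(\mathbb{T}),\ z \in \mathbb{T}.$$

(b) Let $M = M_z$ be the multiplication operator $(Mf)(z) = z f(z)$, $f \in L^2(\mathbb{T})$, $z \in \mathbb{T}$. Then show that $M$ commutes with $S^*S$, i.e., $(S^*S) M = M (S^*S)$.

(c) Does $M$ commute with $SS^*$?

(d) Does $M^N = M_{z^N}$ commute with $SS^*$?
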


5.4. Let $N$, $m$, and $S$ be defined as in Exercise 5.3. Then show that $S$ is a partial isometry (i.e., $S^*S$ is a projection operator) if and only if there is a measurable subset $E \subset \mathbb{T}$ such that
$$S^*S = M_{\chi_E},$$
where
$$(M_{\chi_E} f)(z) = \chi_E(z) f(z), \quad f \in L^2(\mathbb{T}),\ z \in \mathbb{T}.$$
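5.5. Let $(X, \mathcal{B})$ be a measure space, and let $\mu$ be a probability measure defined on the $\sigma$-algebra $\mathcal{B}$. Let $N \in \mathbb{N}$, $N \ge 2$, and let $\sigma\colon X \to X$ and $\tau_k\colon X \to X$ be measurable endomorphisms of $X$ such that

$$\sigma \circ \tau_k = \operatorname{id}_X \quad \text{for } 0 \le k < N.$$
Consider functions $m \in L^\infty(X, \mathcal{B}) =: L^\infty$. Set

$$QF(\tau) = \Bigl\{ m \in L^\infty \Bigm| \sum_{k=0}^{N-1} |m \circ \tau_k|^2 = 1\ \mu\text{-a.e. on } X \Bigr\}.$$

Set
$$(S_m f)(x) = \sqrt{N}\, m(x) f(\sigma(x)), \quad f \in L^2(X, \mu),\ x \in X.$$

Show that $S_m\colon L^2(X, \mu) \to L^2(X, \mu)$ is an isometry for all $m \in QF(\tau)$ if and only if $\mu$ satisfies

$$\mu = \frac{1}{N} \sum_{k=0}^{N-1} \mu \circ \tau_k^{-1}. \tag{5E.1}$$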


5.6. Let $N \in \mathbb{N}$, $N \ge 2$, be given, and set

$$\sigma(x) = Nx \bmod 1, \qquad \tau_k(x) = \frac{x + k}{N}, \quad 0 \le k < N.$$


(a) Then show that the restriction to [0, 1) of Lebesgue measure on R satisfies 
(b) With the identification $\mathbb{R}/\mathbb{Z} \cong [0,1)$, and a function $m \in L^\infty(0,1)$, set
$$(Sf)(x) = \sqrt{N}\, m(x) f(Nx \bmod 1), \tag{5E.2}$$
or
$$Sf = \sqrt{N}\, m\, f \circ \sigma, \quad f \in L^2(0,1).$$
Then show that $S$ is an isometry in $L^2(0,1)$ if and only if

$$\sum_{k=0}^{N-1} \left| m\left(\frac{x + k}{N}\right) \right|^2 = 1 \quad \text{a.e. on } (0,1).$$

(c) Let $S = S_m$ be defined as in (5E.2) for some $m \in L^\infty$. Then show that $S^*S$ is a multiplication operator, i.e., that $S^*S = M_F$ for some $F \in L^\infty$.
(d) For $F$ in (c), show that
$$F(x) = \sum_{k=0}^{N-1} \left| m\left(\frac{x + k}{N}\right) \right|^2.$$
(e) What can you say about $SS^*$?
(f) Let $m_j$, $j = 1, 2$, be two 1-periodic functions in $L^\infty$, and let $S_j$ be the corresponding subdivision operators defined as in (5E.2). Then show that $S_1^* S_2$ is a multiplication operator; specifically, $S_1^* S_2$ is multiplication by

$$\sum_{k=0}^{N-1} \overline{m_1\left(\frac{x + k}{N}\right)}\, m_2\left(\frac{x + k}{N}\right).$$
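5.7. Let $\Phi\colon \mathbb{R} \to \mathbb{C}$ be a given function such that the sum
$$\operatorname{Av} \Phi(x) = \sum_{n \in \mathbb{Z}} \Phi(x + n)$$
is pointwise convergent a.e. $x \in \mathbb{R}$. Let $W$ be a 1-periodic function on $\mathbb{R}$ (i.e., $W(x + 1) = W(x)$, $x \in \mathbb{R}$), and suppose that for some $N \in \mathbb{N}$, $N \ge 2$,

$$W(x)\, \Phi(x) = \Phi(Nx), \quad x \in \mathbb{R},$$

holds.
Set

$$(R_W f)(x) = \sum_{k=0}^{N-1} W\left(\frac{x + k}{N}\right) f\left(\frac{x + k}{N}\right)$$

for periodic functions $f$ on $\mathbb{R}$. Then show that the function
$$h(x) := \operatorname{Av} \Phi(x)$$
satisfies
$$R_W h = h,$$
i.e., that $h$ is a harmonic function for the Ruelle operator $R_W$.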
5.8. Let $W\colon \mathbb{R} \to \mathbb{C}$ be a measurable function, and suppose (i), (ii), and (iii) below hold.

(i) $W(x) = W(x + 1)$ a.e. $x \in \mathbb{R}$, i.e., $W$ is 1-periodic.
(ii) For some $N \in \mathbb{N}$, $N \ge 2$, the infinite product
$$\Phi(x) := \prod_{k=1}^{\infty} W\left(\frac{x}{N^k}\right) \tag{5E.3}$$
is convergent a.e. $x \in \mathbb{R}$.
(iii) $\sum_{k=0}^{N-1} \left| W\left(\frac{x + k}{N}\right) \right| \le 1$ a.e. $x \in \mathbb{R}$.

Then prove that the function $\Phi$ in (5E.3) is in $L^1(\mathbb{R})$, and that

$$\int_{\mathbb{R}} |\Phi(x)|\, dx \le 1.$$


5.9. Let $W \in L^\infty(\mathbb{R})$, and assume that $W$ is 1-periodic, and further that it satisfies

$$W(x) + W\left(x + \tfrac{1}{2}\right) = 1 \quad \text{a.e. } x \in \mathbb{R}. \tag{5E.4}$$

(a) Show that $\int_0^1 W(x)\, dx = \frac{1}{2}$.
(b) Show that for all k € N, the following identity holds: 


9k 3k-1 
fw)" G&) &= [.7"G)" @)# &) 
(c) Show that for all $k \in \mathbb{N}$, the following identity holds:
$$\int_{-2^{k-1}}^{2^{k-1}} \prod_{j=1}^{k} W\left(\frac{x}{2^j}\right) dx = 1.$$


(d) Suppose, in addition to the above, that $W \ge 0$ a.e. on $\mathbb{R}$, and further that the infinite product

$$\Phi(x) = \lim_{k \to \infty} \prod_{j=1}^{k} W\left(\frac{x}{2^j}\right)$$
is (pointwise) convergent on $\mathbb{R}$ a.e. $x$. Then show that

$$\int_{\mathbb{R}} \Phi(x)\, dx \le 1. \tag{5E.5}$$

(e) Under the assumptions on $W$ listed in (d), give examples of when the inequality (5E.5) is strict (i.e., <), and when it is “=”.
Hints to 5.9: (a) Introduce the $\mathbb{R}/\mathbb{Z}$-Fourier coefficients $(c_n)_{n \in \mathbb{Z}}$ for the function $W$, and note that the identity (5E.4) is equivalent to

$$2 c_{2n} = \delta_{0,n}, \quad n \in \mathbb{Z}.$$

Also recall that $c_0 = \int_0^1 W(x)\, dx$.
(b) Prove this by induction, noting that the case $k = 1$ is equivalent to
$$\int_{-1/2}^{1/2} W\left(x + \tfrac{1}{2}\right) dx = \int_{-1/2}^{1/2} W(x)\, dx.$$
But both sides of this last equation agree with the Fourier coefficient $c_0$ ($= \frac{1}{2}$) from (a).
(c) Again this may be proved by induction. While the case $k = 1$ is clear, the next case $k = 2$ is instructive:


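$$\int_{-2}^{2} \prod_{j=1}^{2} W\left(\frac{x}{2^j}\right) dx = \int_{-2}^{2} W\left(\frac{x}{2}\right) W\left(\frac{x}{4}\right) dx = 4 \int_{-1/2}^{1/2} W(x)\, W(2x)\, dx$$

$$= 4 \int_{0}^{1/2} \left( W(x) + W\left(x + \tfrac{1}{2}\right) \right) W(2x)\, dx \quad \text{(using that } W \text{ is 1-periodic)}$$

$$= 4 \int_{0}^{1/2} W(2x)\, dx \quad \text{(by (5E.4))}$$

$$= 2 \int_{0}^{1} W(x)\, dx = 1 \quad \text{(by (a)).}$$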


(d) For each $k \in \mathbb{N}$, we have

$$\int_{-2^{k-1}}^{2^{k-1}} \prod_{j=1}^{k} W\left(\frac{x}{2^j}\right) dx = 1 \quad \text{by (c).}$$

Letting $k \to \infty$, we get the estimate (5E.5) by an application of Fatou’s lemma.
Verify that Fatou applies in this case.


References and remarks 


Some relevant references (books and papers) covering Riesz products are [Rie18, 
Mey79], and a list covering related infinite-product constructs includes the following: 
[Zyg32, May91, Kat87, Kak48, BeBK05].

The idea of using the Ruelle operator and the Kolmogorov extension principle in 
the construction of wavelets from Perron—Frobenius eigenfunctions was suggested 
in Jorgensen’s Memoir [Jor01a], and in two papers by Dorin Dutkay, [Dut02] and 
[Dut04a]. 

One of the advantages of this approach is that we get multiresolution wavelets in 
other Hilbert spaces than the familiar $L^2(\mathbb{R}^d)$: for example, in Hilbert spaces built
on fractals by the use of Hausdorff measure; see [DuJo06b]. See also [Dut04b]. 
In the prelude, we mentioned that random-walk models are realized on infinite
discrete sets S such as the standard lattices $\mathbb{Z}^d$, $d = 1, 2, \dots$, and on various
combinatorial trees rooted at points in a given compact space. The first model is 
widely studied in probability (see, e.g., Revuz [Rev84], Stroock [Stro05], and Spitzer 
[Spi76]); and we have stressed here uses of the tree models in the study of scale- 
similarity, as it is used in the analysis of wavelets and of fractals. 

In all the models, it makes sense to introduce spaces of paths, and path-space 
measures, and there is a separate literature on this; see e.g., [Nel73] and [Sim79]. 
Somewhere between these two structures (lattice and combinatorial tree) there are 
yet other uses of the idea and related probabilistic models. They are encountered for 
example in thermodynamics, e.g., in percolation models from mathematical physics 
[FiEs61]. The infinite sets S we meet there could be Cayley trees, Bethe lattices, or 
bond graphs; see figures 1 and 4 in [FiEs61]. In these models the standard combina- 
torial tree is modified. Instead of a distinguished “root” or starting node, these models 
allow more complicated cactus-like root configurations, and the resulting shapes are 
called pseudolattices. While this theory does combine analysis and probability, it is 
however beyond the scope of our book; and we refer the interested reader to [FiEs61] 
and the papers cited there. 

The relation between our tree in Figure 1.1 (p. 8) and the Bethe lattice is like the
relation between a lattice of N-points (non-negative integer coordinates) and a lattice 
of Z-points (integer coordinates of either sign). (In a Bethe lattice there would be a 
third subtree attached above the point labeled “x.”) The “Bethe lattice” is uniform,
having no distinguished root, point, origin. The shapes in the figures from [FiEs61] 
are “cactus-like” in appearance. 
The minimal eigenfunction


Anything that, in happening, causes itself to happen again, happens again. 
-—Douglas Adams 


PREREQUISITES: Eigenvalues, eigenvectors; eigenfunctions; a curiosity about wave- 
lets as seen by a petroleum engineer. 


Prelude 


This brief chapter serves as a bridge between the iterated function systems (IFS) in 
Chapter 4 and the more detailed wavelet analysis in Chapters 7 and 8. We wish to 
examine a variation of scale, i.e., examine the effect on the Perron—Frobenius—Ruelle 
theory induced by a change of scale in a wavelet basis. After stating a general result 
(Theorem 6.1.1), we take a closer look at a single example: Recall that Haar’s wavelet
is dyadic, i.e., it is a wavelet basis for $L^2(\mathbb{R})$ which arises from the operations of
translation by the integers $\mathbb{Z}$, and by scaling with all powers of two, i.e., scaling
by $2^j$, as $j$ ranges over $\mathbb{Z}$. But the process begins with the unit box function, say
supported in the interval from x = 0 to x = 1. The scaling by 3, i.e., $f \mapsto f(x/3)$,
stretches the support to the interval [0, 3]. It is natural to ask what happens to an
ONB dyadic wavelet under scaling by 3, i.e., $f \mapsto f(x/3)$. This is related to
over-sampling: The simplest instance of this is the following scaling of the low-pass filter
$m_0(x)$, i.e., $m_0(x) \mapsto m_0(3x)$.

Using Perron—Frobenius—Ruelle theory, it can be shown that the ONB property 
typically is lost, but still a number of essential wavelet features are preserved. Recall 
that a consequence of the ONB property is a version of the Parseval identity. Even 
though the ONB property is lost, it follows from our discussion below that the Par- 
seval property is preserved under stretching, i.e., the stretched wavelet will still be a 
Parseval frame, also called a normalized tight frame. 
We have chosen to restrict our discussion of this to one example, the “stretched
Haar wavelet.” However, we emphasize that much more is known, and that there are
many ways to extend these developments beyond the scope of our book. For example,
we know that if $\psi$ is a frame wavelet ($N = 2$) and if $p$ is a fixed odd integer,
then the scaled function $(1/p)\, \psi(x/p)$ is also a frame wavelet. And even in higher
dimensions, the theory takes a nice form.

From wavelets in one dimension to higher dimensions: Instead of the familiar 
1-D translation by Z and scaling with all powers of two, in higher dimensions d the 
translation will instead be by a rank-d lattice and we will be scaling by an integral 
and expansive matrix $A$, i.e., scaling by $A^j$, as $j$ ranges over $\mathbb{Z}$.

In the context of the path-space measures $P_x$ and the associated harmonic functions
from Chapter 2 a lot is known about change of scale. Specifically, in Chapter
2 we introduced the family of probability measures $P_x$ for the case of wavelets on
$\mathbb{R}^d$ (with an expansive matrix $A$ over $\mathbb{Z}$), and we showed that each $P_x$ is atomic. Is
the transformed $P_x$ family again atomic? Or in other words, are the transformed (or
stretched) $P_x$ measures still concentrated on a rank-$d$ lattice, i.e., on some version
of $\mathbb{Z}^d$, just as in the 1-dimensional case? Further it is important to consider the effect
on cycles (for the random walk) under “matrix-stretching,” and the effect on other
structures such as the zeroes of the $P_x$-harmonic functions. And try the same for
self-affine tilings. Or develop a theory for an over-sampling in the general case of
higher dimensions and even for fractals.


6.1 A general construction of $h_{\min}$


The function $h(x) := P_x(\mathbb{N}_0)$ is special among the solutions $h \ge 0$ to the eigenvalue problem
$$R_W h = h. \tag{6.1.1}$$

Our next result shows that $x \mapsto P_x(\mathbb{N}_0)$ is minimal among the solutions to (6.1.1), and we shall use the notation

$$h_{\min}(x) = P_x(\mathbb{N}_0), \quad x \in X. \tag{6.1.2}$$

In an indexed family of wavelet examples, Figure 6.2 (p. 107) contains plots of these functions $h_{\min}$.
Theorem 6.1.1. Let $X$, $\mathcal{B}$, $N$, $W$ be as in Chapter 2, and $\sigma\colon X \to X$ an endomorphism such that $\#\sigma^{-1}(\{x\}) = N$, $x \in X$, and assume in addition that $X$ is a compact Hausdorff space. Suppose branches of $\sigma^{-1}$ may be chosen such that, for some measure $\nu_0$ on $X$,

$$\lim_{n \to \infty} \sum_{(\omega_1, \dots, \omega_n)} \left(W^{(n)} g\right)(\tau_{\omega_n} \cdots \tau_{\omega_1} x) = \nu_0(g) = \int_X g\, d\nu_0 \tag{6.1.3}$$

for all $g \in C(X)$. The function $W^{(n)}$ is defined by

$$W^{(n)}(y) = W(y)\, W(\sigma(y)) \cdots W(\sigma^{n-1}(y)). \tag{6.1.4}$$
Let $h \in C(X)$ be a solution to (6.1.1) such that $h \ge 0$. Then
$$h_{\min}(x)\, \nu_0(h) \le h(x), \quad x \in X. \tag{6.1.5}$$


Remark 6.1.2. With the normalization $\nu_0(h) = 1$, (6.1.5) reads $h_{\min} \le h$, and hence the reference to $h_{\min}$ as the minimal eigenfunction. We saw that there are examples where $h_{\min} = 0$.
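The eigenvalue problem (6.1.1) can be tested numerically in the setting of Example 5.4.3. The sketch below is an illustration only, with $N = 2$, $\tau_s(x) = (x+s)/2$, and $W(x) = \cos^2(3\pi x)$. It checks that both the constant function 1 and the closed form $(\sin 3\pi x / (3 \sin \pi x))^2$ quoted in Section 5.4 for $P_x(\mathbb{Z})$ are fixed by the Ruelle operator, and that the second lies below the first. The function names and sample points are hypothetical choices, and no claim is made here beyond these numerical identities.

```python
import math

def W(x):
    # Weight of Example 5.4.3.
    return math.cos(3 * math.pi * x) ** 2

def ruelle(f, x):
    # (R_W f)(x) = sum_{s=0,1} W(tau_s x) f(tau_s x), tau_s(x) = (x + s)/2.
    return sum(W((x + s) / 2.0) * f((x + s) / 2.0) for s in (0, 1))

def h_closed(y):
    # Closed form quoted in the text for P_x(Z); at integer y the
    # expression is 0/0 and is replaced by its limiting value 1.
    s = math.sin(math.pi * y)
    if abs(s) < 1e-12:
        return 1.0
    return (math.sin(3 * math.pi * y) / (3 * s)) ** 2

def one(y):
    # The constant function 1, harmonic because of the normalization of W.
    return 1.0

for x in (0.1, 0.37, 0.8):
    print(x, ruelle(h_closed, x) - h_closed(x), ruelle(one, x) - 1.0)
```

Both differences printed in each row should vanish up to rounding, and `h_closed(x) <= 1` at every sample point, which is consistent with the ordering described in Remark 6.1.2.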


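Proof of Theorem 6.1.1. Let $k \in \mathbb{N}_0$, and consider the N-adic representation $k = i_1 + i_2 N + \cdots + i_n N^{n-1}$. Note that

$$\omega(k) = (i_1, \dots, i_n, \underbrace{0, 0, 0, \dots}_{\infty \text{ string of zeroes}})$$
and that
$$h(x) = R_W^{n+p} h(x) = \sum_{(\omega_1, \dots, \omega_{n+p})} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x) \cdots W(\tau_{\omega_{n+p}} \cdots \tau_{\omega_1} x)\, h(\tau_{\omega_{n+p}} \cdots \tau_{\omega_1} x). \tag{6.1.6}$$

For $n$ fixed, let $p \to \infty$. Then

$$\lim_{p \to \infty} \sum_{(\omega_{n+1}, \dots, \omega_{n+p})} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x) \cdots W(\tau_{\omega_{n+p}} \cdots \tau_{\omega_1} x)\, h(\tau_{\omega_{n+p}} \cdots \tau_{\omega_1} x) = P_x(\{\omega(k)\})\, \nu_0(h) \tag{6.1.7}$$

for $k = \omega_1 + \omega_2 N + \cdots + \omega_n N^{n-1}$.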
(6.1.8) 


We now exchange limits, and use Fatou’s lemma in the form 
Combining (6.1.6) and (6.1.7), we get

$$\sum_{k \in \mathbb{N}_0} P_x(\{\omega(k)\})\, \nu_0(h) \le h(x). \tag{6.1.9}$$

But the factor on the left-hand side in (6.1.9) is $P_x(\mathbb{N}_0) = h_{\min}(x)$, so the desired estimate (6.1.5) follows. □


Remark 6.1.3. The wavelet representation is included in the framework of the theorem.
Then $X = [0,1]$, and $\nu_0 = \delta_0$ = the Dirac mass at $x = 0$; see [BrJo02a], and
[BrJo02b, Proposition 5.4.11].


6.2 A closed expression for $h_{\min}$


Example 6.2.1. Haar wavelets. We now illustrate the minimal eigenfunction for $R_W$ with the following family of Haar wavelets; see also [BrJo02b] for more details. First, note that the simplest version of the scaling identity (1.3.1) from Chapter 1 is

$$\varphi(t) = \varphi(2t) + \varphi(2t - 1), \quad t \in \mathbb{R}. \tag{6.2.1}$$
The corresponding wavelet function $\psi$ is given by
$$\psi(t) = \varphi(2t) - \varphi(2t - 1), \quad t \in \mathbb{R}. \tag{6.2.2}$$

Even without Fourier transformation (1.3.3), it is immediate by inspection that the equations (6.2.1)-(6.2.2) are solved by the Haar father function,

$$\varphi(t) = \begin{cases} 1, & 0 \le t < 1, \\ 0 & \text{otherwise,} \end{cases} \tag{6.2.3}$$
and the Haar mother function,
$$\psi(t) = \begin{cases} 1, & 0 \le t < 1/2, \\ -1, & 1/2 \le t < 1, \\ 0 & \text{otherwise;} \end{cases} \tag{6.2.4}$$

see also Figure 6.1(a). Also consider the functions

$$\varphi_p(t) = \frac{1}{p}\, \varphi\left(\frac{t}{p}\right), \quad t \in \mathbb{R}, \tag{6.2.5}$$
and
$$\psi_p(t) = \frac{1}{p}\, \psi\left(\frac{t}{p}\right), \quad t \in \mathbb{R}, \tag{6.2.6}$$

for $p = 3, 5, \dots$, odd numbers. The pair $\varphi_3$ and $\psi_3$ are sketched in Figure 6.1(b).
Fig. 6.1. Haar’s wavelets (compare Figure 1.2, p. 13): Haar’s two functions $\varphi$ (6.2.3) and $\psi$
(6.2.4) are illustrated in the two columns: The system in the column on the left in the figure
illustrates Haar’s orthonormal (ONB) wavelet basis, and the scaling identities are visually
apparent. The system of functions $\varphi_3$ (6.2.5) and $\psi_3$ (6.2.6), for $p = 3$, in the right-hand
column illustrates the two stretched Haar functions. The two scaling identities (6.2.1) and
(6.2.2) are visually apparent for both function systems, on the left and on the right. Moreover,
for both systems, the $\psi$ function yields a double-indexed Parseval basis (see (1.3.16)), i.e., a
function system for which the Parseval identity (6.2.12) holds for every function $f$ in $L^2(\mathbb{R})$.
If $p = 1$ (the left-hand-side case), then the inequality in (6.2.11) is an equality, i.e., is an =;
while for $p = 3$ (the right-hand-side column in the figure), the term on the left in (6.2.11)
is strictly smaller than $\|f\|^2_{L^2(\mathbb{R})}$ for non-zero $f$. The reason for this is the overlap for the
$\mathbb{Z}$-translated functions, which is graphically illustrated too.
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As suggested from Figure 6.1(b), the wavelet yp, for p = 3,5,..., is called the 
p-stretched Haar wavelet. 

We leave it to the reader to check that the low-pass filter functions $m_p$ for the wavelet pair $\varphi_p$, $\psi_p$ are

$$m_p(x) = \frac{1 + e^{-i2\pi p x}}{2}, \qquad x \in \mathbb{R},$$

and as a result,

$$W_p(x) = |m_p(x)|^2 = \cos^2(\pi p x).$$


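It is known, and easy to check, that the orthonormality relations hold for $\varphi$, $\psi$, but not for $\varphi_p$, $\psi_p$, $p = 3, 5, \dots$.
For $\varphi$, $\psi$, the orthonormality relations are

$$\|\varphi\|_{L^2(\mathbb{R})} = \|\psi\|_{L^2(\mathbb{R})} = 1, \tag{6.2.7}$$

$$\{\varphi(\cdot - k) \mid k \in \mathbb{Z}\} \text{ is orthonormal}, \tag{6.2.8}$$

i.e.,

$$\langle \varphi \mid \varphi(\cdot - k) \rangle_{L^2(\mathbb{R})} = \delta_{0,k}, \qquad k \in \mathbb{Z},$$

and

$$\{2^{n/2}\,\psi(2^n t - k) \mid n, k \in \mathbb{Z}\} \text{ is an orthonormal basis (ONB) for } L^2(\mathbb{R}). \tag{6.2.9}$$

In contrast, $\varphi_p$, $\psi_p$ satisfy

$$\|\varphi_p\|_{L^2(\mathbb{R})} = \|\psi_p\|_{L^2(\mathbb{R})} = \frac{1}{\sqrt{p}}, \tag{6.2.10}$$

$$\sum_{k \in \mathbb{Z}} |\langle \varphi_p(\cdot - k) \mid f \rangle|^2 \le \|f\|^2_{L^2(\mathbb{R})} \tag{6.2.11}$$

for all $f$ in the closed subspace $\mathcal{V}_0$ spanned by the translates $\{\varphi_p(\cdot - k) \mid k \in \mathbb{Z}\}$, and

$$\sum_{k, n \in \mathbb{Z}} |\langle 2^{n/2}\,\psi_p(2^n \cdot - k) \mid f \rangle|^2 = \|f\|^2_{L^2(\mathbb{R})} \quad \text{for all } f \in L^2(\mathbb{R}). \tag{6.2.12}$$

The properties (6.2.11)–(6.2.12) are called frame properties. Basis functions satisfying such a generalized Parseval identity are called Parseval frames, or normalized tight frames. This frame requirement is evidently weaker than the requirements of an ONB; see (6.2.8)–(6.2.9).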

The discrepancy between them may be measured by the autocorrelation coefficients from (6.2.8): since (6.2.8) is not satisfied for $\varphi_p$, $p = 3, 5, \dots$, the 1-periodic function

$$h_p(x) = \sum_{k \in \mathbb{Z}} \langle \varphi_p \mid \varphi_p(\cdot - k) \rangle_{L^2(\mathbb{R})}\, e^{i2\pi k x} \tag{6.2.13}$$

is not constant. Of course $h_1(x) = 1$, $x \in \mathbb{R}$.
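As a worked instance of (6.2.13), take $p = 3$. By (6.2.3) and (6.2.5), $\varphi_3$ equals $1/3$ on $[0, 3)$ and vanishes elsewhere, so the coefficient $\langle \varphi_3 \mid \varphi_3(\cdot - k) \rangle$ is $1/9$ times the length of the overlap of $[0, 3)$ and $[k, k + 3)$, i.e., $(3 - |k|)/9$ for $|k| \le 2$ and $0$ otherwise. The sum then collapses to a short trigonometric polynomial:

$$h_3(x) = \frac{1}{9}\bigl(3 + 4\cos 2\pi x + 2\cos 4\pi x\bigr), \qquad h_3(0) = 1, \qquad h_3\bigl(\tfrac{1}{3}\bigr) = \frac{3 - 2 - 1}{9} = 0.$$

In particular $h_3$ is not constant, in contrast to $h_1 \equiv 1$.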

In many applications, it is too much to require that the wavelet produce a true ONB, i.e., that (6.2.9) be satisfied, and the weaker requirement (Parseval frame) (6.2.12) is typically good enough. For these wavelets, the minimal eigenfunction satisfies

$$0 \le P_x(\mathbb{Z}) \le 1, \qquad x \in [0, 1],$$

as we shall now illustrate.

From the analysis in Chapter 1 above, we note that there is a wavelet transfer operator $R_p = R_{W_p}$ for each $p = 1, 3, 5, \dots$. An inspection of (6.2.5)–(6.2.6) shows that

$$W_p(x) = \cos^2(\pi p x). \tag{6.2.14}$$

Now take

$$\nu = \delta_0 = \delta(x - 0). \tag{6.2.15}$$

Then it is clear from (6.2.14) that

$$\nu R_p = \nu, \tag{6.2.16}$$

or equivalently,

$$\int R_p f \, d\nu = \int f \, d\nu, \qquad f \in C([0, 1]).$$
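The step from (6.2.14) to (6.2.16) can be spelled out in one line, assuming the dyadic form of the transfer operator, $(R_p f)(x) = W_p(x/2)\, f(x/2) + W_p((x+1)/2)\, f((x+1)/2)$ (this is the scalar, $N = 2$ case of the general definition; the explicit formula is an assumption here, since Chapter 1 is not reproduced in this section). Evaluating at $x = 0$ gives

$$(R_p f)(0) = \cos^2(0)\, f(0) + \cos^2\!\left(\frac{\pi p}{2}\right) f\!\left(\tfrac{1}{2}\right) = f(0),$$

because $\cos(\pi p / 2) = 0$ for every odd $p$. Integrating against $\delta_0$ is evaluation at $0$, which is the stated invariance.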


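Using now Lemma 1.4.1 and the Fourier transform (1.3.3), we also conclude that

$$h_p(x) = \sum_{k \in \mathbb{Z}} |\hat{\varphi}_p(x - k)|^2 = \sum_{k \in \mathbb{Z}} \left(\frac{\sin \pi p (x - k)}{\pi p (x - k)}\right)^2 = \frac{1}{p^2} \left(\frac{\sin \pi p x}{\sin \pi x}\right)^2, \tag{6.2.17}$$

which is a closed expression for the minimal harmonic function; see also [BrJo02b, page 332, Figure 6.3].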
It follows from (6.2.13)–(6.2.17) that

$$h_p(0) = 1$$

and that

$$R_{W_p} h_p = h_p, \tag{6.2.18}$$

and we conclude from Theorem 6.1.1 that $h_p$ is the minimal harmonic function. Specifically, if $h$ is any other continuous function satisfying

$$h(0) = 1, \qquad R_{W_p} h = h, \qquad 0 \le h \le 1, \tag{6.2.19}$$

then

$$h_p \le h \le 1 \quad \text{pointwise}. \tag{6.2.20}$$


See Figure 6.2. 
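The closed expression (6.2.17) and the eigenvalue equation (6.2.18) can be checked numerically. The sketch below is only an illustration, not the book's own computation: it assumes the overlap formula $(p - |k|)/p^2$ for the autocorrelation coefficients of $\varphi_p$ (zero for $|k| \ge p$) and the dyadic form of the transfer operator, $(R_W h)(x) = W(x/2)\,h(x/2) + W((x+1)/2)\,h((x+1)/2)$. The function names are hypothetical.

```python
import cmath
import math


def h_series(p, x):
    """Sum (6.2.13) with the assumed overlap coefficients (p - |k|) / p**2, |k| < p."""
    total = 0.0
    for k in range(-(p - 1), p):
        total += (p - abs(k)) / p**2 * cmath.exp(2j * math.pi * k * x).real
    return total


def h_closed(p, x):
    """Closed form (6.2.17); at integer x the value is the limit 1."""
    s = math.sin(math.pi * x)
    if abs(s) < 1e-12:
        return 1.0
    return (math.sin(math.pi * p * x) / (p * s)) ** 2


def transfer(p, h, x):
    """Dyadic transfer operator with weight W_p(y) = cos^2(pi p y) (assumed form)."""
    def w(y):
        return math.cos(math.pi * p * y) ** 2
    y0, y1 = x / 2, (x + 1) / 2          # the two preimages of x under y -> 2y mod 1
    return w(y0) * h(y0) + w(y1) * h(y1)


p = 3
xs = [i / 97 for i in range(1, 97)]       # sample points strictly inside (0, 1)
max_gap_series = max(abs(h_series(p, x) - h_closed(p, x)) for x in xs)
max_gap_eigen = max(
    abs(transfer(p, lambda y: h_closed(p, y), x) - h_closed(p, x)) for x in xs
)
print(max_gap_series, max_gap_eigen)
```

Both gaps should be at the level of floating-point rounding. Evaluating `h_closed(3, 1/3)` gives a value indistinguishable from zero, which shows concretely that $h_3$ dips below the constant solution $h \equiv 1$ of (6.2.19).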
Exercises


6.1. (a) Set $W(x) = \cos^2(3\pi x)$, and show that


Il" (@)="a 


(b) Verify by a direct argument that 


while 


6.2. Let $\psi$ be the Haar wavelet function from Figure 6.1 (p. 103), and set

$$\psi_{j,k}(t) = 2^{j/2}\, \psi(2^j t - k).$$

(a) Show that if $j \in \mathbb{N}_0$ and $k = 0, 1, \dots, 2^j - 1$, then the functions $\psi_{j,k}$ are supported in the unit interval $J = [0, 1]$.

(b) Show that the functions $\psi_{j,k}$ indexed by $\Lambda := \{(j, k) \mid j \in \mathbb{N}_0,\ k = 0, 1, \dots, 2^j - 1\}$ form an orthonormal basis (ONB) in $L^2(J)$ when $J$ is given the Lebesgue measure, i.e., that $\{\psi_{j,k} \mid (j, k) \in \Lambda\}$ is an ONB for $L^2(J)$.

(c) Sketch the basis functions in the cases $j = 0$, $j = 1$, and $j = 2$.

6.3. Can you find other “interesting” functions $\psi$ on $J = [0, 1]$ and subsets $\Lambda \subset \mathbb{Z} \times \mathbb{Z}$ such that the family $\{\psi_{j,k} \mid (j, k) \in \Lambda\}$ forms an ONB for $L^2(J)$, where $\psi_{j,k}(t) = 2^{j/2}\, \psi(2^j t - k)$?

6.4. Let $\psi_3$ be the stretched Haar wavelet function from Figure 6.1 (p. 103), and set

$$\psi^{(3)}_{j,k}(t) = 2^{j/2}\, \psi_3(2^j t - k).$$

(a) Show that if $j \in \mathbb{N}_0$ and $k = 0, 1, \dots, 3(2^j - 1)$, then the functions $\psi^{(3)}_{j,k}$ are supported in the stretched interval $J_3 := [0, 3]$; see Figure 6.1.

(b) Show that the functions $\psi^{(3)}_{j,k}$ indexed by $\Lambda_3 := \{(j, k) \mid j \in \mathbb{N}_0,\ k = 0, 1, \dots, 3(2^j - 1)\}$ form a Parseval frame for $L^2(J_3)$, i.e., $\{\psi^{(3)}_{j,k} \mid (j, k) \in \Lambda_3\}$ is a normalized tight frame in $L^2(J_3)$; i.e., that

$$\sum_{(j,k) \in \Lambda_3} \bigl|\langle \psi^{(3)}_{j,k} \mid f \rangle\bigr|^2 = \int_0^3 |f(t)|^2 \, dt$$

holds for all $f \in L^2(J_3)$.

$h = h_3(3x)$


Fig. 6.2. Solutions to the eigenvalue problem (6.2.19). 


References and remarks 


Some relevant references (books and papers) covering the Ruelle operator and its Perron–Frobenius eigenspace are [Kea72, PaPo90, Bal00, BrJo02b, CoRa90, Jam64, May91, Rue02, Rue69, Rue94].

The wavelet transfer operator is used in a variety of wavelet applications not covered here, or only touched upon tangentially: stability of refinable functions, regularity, approximation order, unitary matrix extension principles, and data mining, to mention only a few. The reader is referred to the following papers for more details on these subjects: [DaHRS03, RoSh03, RST01, JiJS01, RoSh00, RiSh00, JiSh99, She98, RoSh98, LaLS98, RoSh97].

Several of these topics and papers invite the kind of probabilistic tools that we 
stress in our book, but a more systematic discussion is outside the scope of our book. 
We only hope to offer an introduction to a variety of more specialized topics.


Generalizations and applications


One cannot expect any serious understanding of what wavelet analysis 
means without a deep knowledge of the corresponding operator theory. 
—Yves Meyer 


PREREQUISITES: Having seen a few approximation problems; spectrum; tensor 
product of matrices; unitary matrices; Hilbert space; a rough idea about recursive 
algorithms (visual representations); large matrices; a curiosity about some cute ideas 
from engineering. 


Prelude 


In our earlier discussion of “wavelet-like” bases in Hilbert space $\mathcal{H}$, we stressed the geometric point of view, which begins with a subspace $\mathcal{V}_0$ and two operations. For standard wavelets in one variable, $\mathcal{H}$ will be the Hilbert space $L^2(\mathbb{R})$, and a suitable “resolution subspace” $\mathcal{V}_0$ will be chosen and assumed invariant under translation by the group of integers $\mathbb{Z}$. In addition, it will be required that $\mathcal{V}_0$ be invariant under some definite scaling operator, for example under “stretching” $f \mapsto f(x/2)$. Thirdly, the traditional multiresolution (MRA) approach to $L^2(\mathbb{R})$-wavelets demands that the chosen subspace $\mathcal{V}_0$ be singly generated, i.e., generated by a single function $\varphi$, the father function, i.e., the normalized $L^2$-function which solves the scaling identity (see (1.3.1) in Chapter 1). As is known, it turns out that these demands for a subspace $\mathcal{V}_0$ are rather stringent.

Hence in this and the next chapter, we shall generalize the familiar MRA picture in two ways. First, we shall allow $\mathcal{V}_0$ to be multiply generated: more than one father function. When $\mathcal{V}_0$ is multiply generated, it is said to give a generalized multiresolution analysis (GMRA). Secondly, we shall extend the MRA idea to other Hilbert spaces than $L^2(\mathbb{R})$ or $L^2(\mathbb{R}^d)$.

We saw in Chapter 4 that the idea of a “multiresolution” applies to fractals. An inherent feature of fractals $X$ is a scaling law. Intuitively a fractal $X$ is a “fern-like” geometric object which looks the same when a natural “scale” is varied, i.e., the same at small scales and at large scales. The notion of scaling can take many forms: it can be arithmetic (e.g., dyadic scaling, $N$-adic scaling, or matrix scaling), or geometric (e.g., iterated composition of rational functions of one complex variable), or symbolic such as is employed in encoding of discrete automata-dynamics by shifts or by substitution in a finite alphabet.

We will see that the geometric feature of a traditional wavelet “multiresolution” is very adaptable. If we think of a multiresolution as a selection of a distinguished subspace $\mathcal{V}_0$ in a “fractal” or symbolic Hilbert space $\mathcal{H}$, then it turns out that computational algorithms from wavelet theory carry over. But in these more non-traditional variants of “multiresolutions,” perhaps the analogue of the familiar group of $\mathbb{Z}$-translations in the Hilbert space $L^2(\mathbb{R})$ is not immediately transparent. Yet it is well known, and recalled in (7.1.4) below, that the group of $\mathbb{Z}$-translations in the Hilbert space $L^2(\mathbb{R})$ is Fourier equivalent to multiplication in $L^2(\mathbb{R})$ by the Fourier frequencies $e^{i2\pi k x}$. (Note that the spectral transform in (7.1.4) below is more general. Specifically, the transform for a generalized multiresolution (GMRA) should be thought of as an extension of the basic Fourier equivalence principle!) And it turns out that all the three non-traditional variants of “multiresolutions” do have abelian algebras of multiplication operators which leave invariant suitable resolution subspaces $\mathcal{V}_0$.


7.1 Translations and the spectral theorem 


Special to this chapter is a set of sequential figures and illustrations: Figures 7.1, 7.2, and 7.3 (decision-tree view, pp. 111-113), and Figures 7.4 and 7.5 (graphs of two classes of wavelet packets, pp. 118-121); the sequence from Figure 7.6 to 7.15 (subspace analysis, pp. 123-128, 132-133); and Figure 7.16 (visual exhibit of the sequential cascade algorithm, here applied to the two Daubechies wavelet functions $\varphi$ and $\psi$, p. 134).

Indeed, the central theme in the whole book is captured in the figures here in Chapter 7. We resume specific and detailed citation of these figures by number later in Chapter 7.

Fig. 7.1. The Fourier transform of the basis sequence (before scaling and selection) generated from the scaling function $\varphi_0$, using the full combinatorial tree.


The first block of figures, i.e., Figures 7.1, 7.2, and 7.3, illustrates how the pyramid algorithm from Chapter 1 takes shape as a decision tree. While in our present applications we emphasize the wavelet context for all of this, in fact it is the same underlying idea which is used in our fractal constructions from Chapters 3 and 4. And the same idea may be used even more generally in a variety of data structures which happen to come equipped with some inherent notion of scaling and similarity, such as is known to be the case for digital data clouds, for graphs and for certain manifolds.

Fig. 7.2. The generation of the basis functions $\varphi_0, \varphi_1, \varphi_2, \dots$ by use of the algorithm (7.5.5) initiated with $\varphi_0 = \varphi$ such that $\varphi(t) = 2 \sum_{j \in \mathbb{Z}} a_j\, \varphi(2t - j)$, the functional equation by which, in the first (A) branch, $\varphi_0$ “re-generates” itself.

For a number of applications, it is convenient to have a matrix variant of the measures $P_x$ constructed in Chapter 2; see especially (2.4.6). For the wavelet applications, this is relevant when the scaling identity, or refinement equation, (1.3.1) does not have solutions that are scalar-valued $L^2(\mathbb{R})$-functions. For multiwavelets this happens when the starting resolution subspace $\mathcal{V}_0 \subset L^2(\mathbb{R})$ has multiplicity. By this, we mean that the representation of $\mathbb{Z}$ by translation in $\mathcal{V}_0$ is not cyclic. Equivalently, setting

$$T_k f(x) = f(x - k), \qquad f \in \mathcal{V}_0,\ x \in \mathbb{R},\ k \in \mathbb{Z}, \tag{7.1.1}$$

we get a spectral representation in the following form: there is an isometric isomorphism

$$J\colon \mathcal{V}_0 \to L^2(S_1) \oplus L^2(S_2) \oplus \cdots \tag{7.1.2}$$

such that

$$\cdots \subset S_{i+1} \subset S_i \subset \cdots \subset S_2 \subset S_1 \subset [0, 1] \tag{7.1.3}$$

and

$$J T_k f = e^{i2\pi k \lambda}\, J f, \qquad k \in \mathbb{Z},\ f \in \mathcal{V}_0. \tag{7.1.4}$$

To see this, note that, by the spectral theorem, there is a projection-valued measure

$$p\colon [0, 1] \to \text{projections of } L^2(\mathbb{R}) \tag{7.1.5}$$

such that

$$T_k f = \int_0^1 e^{i2\pi k \lambda}\, p(d\lambda)\, f, \qquad k \in \mathbb{Z},\ f \in \mathcal{V}_0. \tag{7.1.6}$$

To get the sets $S_j$, $j = 1, 2, \dots$, take

$$S_j := \{\lambda \in [0, 1] \mid \dim p(\lambda) \ge j\}. \tag{7.1.7}$$

The reader is referred to [BaMe99], [BaMM99], [BaCM02], and [BaJMP04].

These applications motivate two generalizations of the more traditional wavelet setup (see, e.g., [Dau92]): first, the wavelet filters and the functions $W$ (see Chapter 1 above) are not continuous, and second, they are matrix-valued.

Fig. 7.3. The generation of the basis functions $\varphi_1, \varphi_2, \varphi_3, \dots$ by use of the algorithm (7.5.5) initiated with $\varphi_1 = \psi$.


7.2 Multiwavelets and generalized multiresolution analysis (GMRA)


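We now outline the generalizations of the multiresolutions from Chapter 1 which are entailed by the case of multiply generated resolution subspaces $\mathcal{V}_0 \subset L^2(\mathbb{R})$. The multiplicity function $\mu$ is identically $1$ for the cases covered by (1.3.1). For the general case, we consider $\varphi\colon \mathbb{R} \to \mathcal{H}$ for some Hilbert space $\mathcal{H}$, and we are interested in solutions $\varphi \in L^2(\mathbb{R})$. In this interpretation, the coefficients $a_k$ from (1.3.1) are operators $a_k\colon \mathcal{H} \to \mathcal{H}$, and the filter function (1.3.2) is operator-valued. Its Fourier-transformed variant (1.3.4) must also be interpreted in this context: $\hat{\varphi}\colon \mathbb{R} \to \mathcal{H}$, $m(x)\colon \mathcal{H} \to \mathcal{H}$, and

$$\hat{\varphi}(x) = m\!\left(\frac{x}{N}\right) \hat{\varphi}\!\left(\frac{x}{N}\right) \tag{7.2.1}$$

is an identity for vectors in $\mathcal{H}$. Since

$$\int_{\mathbb{R}} \|\varphi(t)\|^2_{\mathcal{H}} \, dt = \int_{\mathbb{R}} \|\hat{\varphi}(x)\|^2_{\mathcal{H}} \, dx,$$

we need a product formula for $\|\hat{\varphi}(x)\|^2_{\mathcal{H}}$. Iteration of (7.2.1) yields

$$\|\hat{\varphi}(x)\|^2_{\mathcal{H}} = \left\langle \hat{\varphi}\!\left(\frac{x}{N^n}\right) \,\middle|\, W_n(x)\, \hat{\varphi}\!\left(\frac{x}{N^n}\right) \right\rangle,$$

where

$$W_n(x) := m\!\left(\frac{x}{N^n}\right)^{\!*} \cdots\, m\!\left(\frac{x}{N}\right)^{\!*} m\!\left(\frac{x}{N}\right) \cdots\, m\!\left(\frac{x}{N^n}\right). \tag{7.2.2}$$

The function $m$ is assumed to satisfy the normalization condition

$$\sum_{\sigma(y) = x} m(y)^* m(y) = I_{\mathcal{H}}, \qquad x \in X. \tag{7.2.3}$$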
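For orientation, here is the simplest scalar instance of (7.2.3), under the assumptions $\mathcal{H} = \mathbb{C}$, $N = 2$, $\sigma(y) = 2y \bmod 1$ on $X = [0, 1)$, and the Haar filter $m(y) = \tfrac{1}{2}(1 + e^{-i2\pi y})$. The two solutions of $\sigma(y) = x$ are $y = x/2$ and $y = (x+1)/2$, and $|m(y)|^2 = \cos^2(\pi y)$, so

$$\sum_{\sigma(y) = x} |m(y)|^2 = \cos^2\!\left(\frac{\pi x}{2}\right) + \cos^2\!\left(\frac{\pi x}{2} + \frac{\pi}{2}\right) = \cos^2\!\left(\frac{\pi x}{2}\right) + \sin^2\!\left(\frac{\pi x}{2}\right) = 1.$$

In the operator-valued case the same sum is required to equal the identity operator $I_{\mathcal{H}}$.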


To state our result, we first provide the setting for our operator-valued variant of 
Lemma 2.4.1. 


7.3 Operator-coefficients 


Definitions 7.3.1. (a) Let $\mathcal{H}$ be a complex Hilbert space. An operator $T$ in $\mathcal{H}$ is said to be positive (alias semidefinite) if $\langle u \mid Tu \rangle \ge 0$ for all $u \in \mathcal{H}$. The inner product in $\mathcal{H}$ is denoted $\langle \cdot \mid \cdot \rangle$, and is assumed linear in the second variable. If $(\Omega, \mathcal{B}_\Omega)$ is a probability space, we say that a function

$$P\colon \mathcal{B}_\Omega \to \operatorname{Pos}(\mathcal{H})$$

is an operator-valued measure if:

(i) $P(\cdot)$ is $\sigma$-additive,
(ii) $0 \le P(A) \le I$, or equivalently

$$0 \le \langle u \mid P(A)\, u \rangle \le \|u\|^2, \qquad u \in \mathcal{H},\ A \in \mathcal{B}_\Omega,$$

and
(iii) $P(\Omega) = I$.

(b) Let $(X, \mathcal{B})$ be a measure space, and let $W\colon X \to \operatorname{Pos}(\mathcal{H})$ be a measurable function, i.e., for every $u \in \mathcal{H}$, $x \mapsto \langle u \mid W(x)\, u \rangle$ is measurable, and

$$0 \le \langle u \mid W(x)\, u \rangle \le \|u\|^2. \tag{7.3.1}$$

Let $\sigma\colon X \to X$ be an $N$-to-1 measurable mapping, with measurable inverse branches $\tau_0, \dots, \tau_{N-1}$. Suppose

$$\sum_{i=0}^{N-1} W \circ \tau_i = I_{\mathcal{H}}. \tag{7.3.2}$$

We then define a corresponding transfer operator

$$R_W f(x) = \sum_{y \in X,\ \sigma(y) = x} W(y)\, f(y) \tag{7.3.3}$$

acting on $L^\infty(X, \mathcal{H})$, i.e., bounded measurable functions $f\colon X \to \mathcal{H}$.
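The definition (7.3.3) can be made concrete in the scalar case $\mathcal{H} = \mathbb{C}$. The sketch below is a minimal illustration under stated assumptions: $X = [0, 1)$, $\sigma(x) = Nx \bmod 1$, inverse branches $\tau_i(x) = (x + i)/N$, and the example weight $W(x) = \cos^2(\pi x)$ for $N = 2$. The helper names are hypothetical and not taken from the text.

```python
import math


def inverse_branches(n):
    """Inverse branches tau_i(x) = (x + i) / n of sigma(x) = n x mod 1 (assumed map)."""
    return [lambda x, i=i: (x + i) / n for i in range(n)]


def transfer_operator(w, n):
    """Return R_W as a map f -> R_W f, following (7.3.3) in the scalar case."""
    taus = inverse_branches(n)

    def apply(f):
        # Sum W(y) f(y) over the n preimages y = tau_i(x) of x.
        return lambda x: sum(w(tau(x)) * f(tau(x)) for tau in taus)

    return apply


def w(x):
    """Example weight for N = 2; it satisfies W(x/2) + W((x+1)/2) = 1."""
    return math.cos(math.pi * x) ** 2


r = transfer_operator(w, 2)
values = [r(lambda x: 1.0)(k / 10) for k in range(10)]
print(values)
```

Because the weight satisfies the normalization (7.3.2), the constant function $1$ is fixed by $R_W$, so every entry of `values` equals 1 up to rounding.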


With these definitions, we now outline how the main results in Chapter 2 carry over. We state them with only sketches of proof, as the required modifications of the arguments in Chapter 2 are mainly technical. The main added issue is non-commutativity, so when we write products like


$$W(\sigma^{n} y)\, W(\sigma^{n-1} y) \cdots W(\sigma y)\, W(y), \tag{7.3.4}$$


then we mean products of operators acting on $\mathcal{H}$.

Our next result, Lemma 7.4.1, is an operator-valued variant of the scalar case above, in the form of Lemma 2.4.1. In the operator-valued context, we get again the existence and uniqueness of the measures $P_x$.


Lemma 7.4.1. Let $X$, $\sigma$, $N$, $\tau_0, \dots, \tau_{N-1}$, $\mathcal{H}$, and $W$ be as described above.

(a) Then, for every $x \in X$, there is a unique positive operator-valued Radon probability measure $P_x$ on $\Omega = \{0, 1, \dots, N - 1\}^{\mathbb{N}}$ such that

$$P_x(A(i_1, \dots, i_n)) = W(\tau_{i_1} x)\, W(\tau_{i_2} \tau_{i_1} x) \cdots W(\tau_{i_n} \cdots \tau_{i_1} x), \tag{7.4.1}$$

where $A(i_1, \dots, i_n)$ denotes a cylinder set as defined in (2.3.5).

(b) Let the assumptions be as in (a), but consider instead the operator function

$$W_n(i_1, \dots, i_n; x) = m(\tau_{i_n} \cdots \tau_{i_1} x)^* \cdots m(\tau_{i_1} x)^*\, m(\tau_{i_1} x) \cdots m(\tau_{i_n} \cdots \tau_{i_1} x),$$

with $m$ satisfying (7.2.3). Then, for every $x \in X$, there is a unique Radon measure $P_x$ on $\Omega$ such that

$$P_x(A(i_1, \dots, i_n)) = W_n(i_1, \dots, i_n; x).$$

Proof sketch. For each $n \in \mathbb{N}$, consider functions $f\colon \Omega \to \mathcal{H}$ such that $f(\omega) = f(\omega_1, \dots, \omega_n)$, and define

$$P_x^{(n)}[f] = \sum_{\omega_1, \dots, \omega_n \in \mathbb{Z}_N} W(\tau_{\omega_1} x) \cdots W(\tau_{\omega_n} \cdots \tau_{\omega_1} x)\, f(\omega_1, \dots, \omega_n). \tag{7.4.2}$$

As in Chapter 2, we check that $P_x^{(n+1)}[f] = P_x^{(n)}[f]$. Using this, the conclusion in Lemma 7.4.1 may be obtained as in Chapter 2, mutatis mutandis. $\square$
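The consistency check behind the proof sketch is easy to see numerically in the scalar case. The following sketch assumes $\mathcal{H} = \mathbb{C}$, $N = 2$, $\tau_i(x) = (x + i)/2$, and the Haar weight $W(y) = \cos^2(\pi y)$; these choices, and the function name, are illustrative assumptions rather than part of the lemma.

```python
import itertools
import math


def cylinder_measure(w, n_branches, word, x):
    """Scalar version of (7.4.1): W(tau_{i1} x) W(tau_{i2} tau_{i1} x) ..."""
    value, point = 1.0, x
    for i in word:
        point = (point + i) / n_branches   # apply the next inverse branch
        value *= w(point)
    return value


def w(y):
    """Haar weight for N = 2 (assumed example)."""
    return math.cos(math.pi * y) ** 2


x = 0.3

# Total mass of all cylinders of length n, for n = 1, ..., 5.
totals = []
for n in range(1, 6):
    words = itertools.product(range(2), repeat=n)
    totals.append(sum(cylinder_measure(w, 2, word, x) for word in words))

# Consistency: a cylinder of length n is the union of its length-(n+1) refinements.
word = (1, 0, 1)
coarse = cylinder_measure(w, 2, word, x)
refined = sum(cylinder_measure(w, 2, word + (i,), x) for i in range(2))
print(totals, coarse, refined)
```

Each entry of `totals` equals 1, and `coarse` agrees with `refined`; both facts are direct consequences of the normalization (7.3.2), which is what allows the family $P_x^{(n)}$ to extend to a single measure.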


We now list some of the other conclusions which carry over. 


Theorem 7.4.2. Let the setting be as described in the lemma, and let $(P_x)_{x \in X}$ be the corresponding operator-valued process. Then (i)–(v) below hold.

(i) The infinite product

$$\prod_{n=1}^{\infty} W(\tau_0^n x) = F(x) \tag{7.4.3}$$

exists if and only if $P_x(\{\mathbf{0}\}) > 0$, where $\mathbf{0} = (0, 0, 0, \dots) \in \Omega$ is the infinite string of zeroes. The condition is that the operator $P_x(\{\mathbf{0}\})$ has zero kernel.

(ii) The function

$$h(x) = P_x(\mathbb{N}_0), \qquad x \in X, \tag{7.4.4}$$

solves the eigenvalue problem

$$R_W h = h. \tag{7.4.5}$$

(iii) As before, $P_x(\cdot)$ extends from $\mathbb{N}_0$ to $\mathbb{Z}$, according to formula (1.4.5).


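(iv) Returning to the multiwavelet case of (7.1.1)–(7.1.7), let $\sigma, \tau_0, \dots, \tau_{N-1}$ be given as in (5.4.9). Choose $\varphi_1, \varphi_2, \dots$ in $\mathcal{V}_0$ such that

$$\sum_{k \in \mathbb{Z}} \overline{\hat{\varphi}_i(x + k)}\, \hat{\varphi}_j(x + k) = \delta_{i,j}\, \chi_{S_j}(x). \tag{7.4.6}$$

Then

$$h(x) = P_x(\mathbb{Z}) = \begin{pmatrix} \chi_{S_1}(x) & 0 & \cdots \\ 0 & \chi_{S_2}(x) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

solves (7.4.5). In this case the matrix function $x \mapsto W(x)$ is as described in [BaJMP04].

(v) In the setting of (iv), the multiplicity function $\mu$ of (7.1.7) is

$$\mu(x) = \sum_{i} \sum_{k \in \mathbb{Z}} |\hat{\varphi}_i(x + k)|^2, \tag{7.4.7}$$

and the projection $Q(x)$ onto the range subspace of

$$J\mathcal{V}_0 = \{(\hat{f}(x + k))_{k \in \mathbb{Z}} \mid f \in \mathcal{V}_0\} \tag{7.4.8}$$

is

$$Q(x) = \sum_{i} |\hat{\varphi}_i(x + \cdot)\rangle \langle \hat{\varphi}_i(x + \cdot)|. \tag{7.4.9}$$

Specifically, the range of $Q(x)$ is the subspace

$$\{(\hat{f}(x + k))_{k \in \mathbb{Z}} \in \ell^2 \mid f \in \mathcal{V}_0\},$$

and $\mu(x)$ is the dimension of this subspace.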


Our next result is motivated by the theory of wavelet packets of Coifman et al.; 
see [CoMW95], [CoWi93], [Wic93], and the book [Wic94] by Wickerhauser. 


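Lemma 7.4.3. Let $X$, $\sigma$, $N$, $\tau_0, \dots, \tau_{N-1}$, $\mathcal{H}$, and $(W_i)_{0 \le i < N}$ be given satisfying the conditions in Definitions 7.3.1, and (7.4.10).

We continue with the setup from Definitions 7.3.1, but with the following addition: instead of a single operator function

$$W\colon X \to \operatorname{Pos}(\mathcal{H}),$$

we consider a family

$$W_i\colon X \to \operatorname{Pos}(\mathcal{H}), \qquad i = 0, 1, \dots, N - 1,$$

each one satisfying (7.3.2), i.e., we assume that

$$\sum_{y \in X,\ \sigma(y) = x} W_i(y) = I_{\mathcal{H}}, \qquad i = 0, 1, \dots, N - 1,\ x \in X. \tag{7.4.10}$$

Then for every $x \in X$, there is a unique positive operator-valued Radon probability measure $P_x$ on $\mathbb{N}_0 \times \Omega$ such that

$$P_x(\{(\omega, \xi) \in \mathbb{N}_0 \times \Omega \mid \omega_1 = i_1, \dots, \omega_n = i_n,\ \xi_1 = j_1, \dots, \xi_n = j_n\}) = W_{i_1}(\tau_{j_1} x)\, W_{i_2}(\tau_{j_2} \tau_{j_1} x) \cdots W_{i_n}(\tau_{j_n} \cdots \tau_{j_1} x). \tag{7.4.11}$$

As usual, we are identifying $\mathbb{N}_0$ with a subset of $\Omega$.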




Fig. 7.4. The first thirty-two functions in the sequence $\varphi_n$ in the case of the Haar wavelet construction.

Fig. 7.5. The first thirty-two functions in the sequence $\varphi_n$ in the case of the Daubechies wavelet construction.

Proof sketch. For each $n \in \mathbb{N}$, consider functions $f\colon \mathbb{N}_0 \times \Omega \to \mathcal{H}$ such that $f(\omega, \xi) = f(\omega_1, \dots, \omega_n; \xi_1, \dots, \xi_n)$. Define

$$P_x^{(n)}[f] = \sum_{(\omega_1, \dots, \omega_n)} \sum_{(\xi_1, \dots, \xi_n)} W_{\omega_1}(\tau_{\xi_1} x)\, W_{\omega_2}(\tau_{\xi_2} \tau_{\xi_1} x) \cdots W_{\omega_n}(\tau_{\xi_n} \cdots \tau_{\xi_1} x)\, f(\omega_1, \dots, \omega_n; \xi_1, \dots, \xi_n), \tag{7.4.12}$$

and check that

$$P_x^{(n+1)}[f] = P_x^{(n)}[f]. \tag{7.4.13}$$

To see this, note that by taking $n$ sufficiently large, we may assume that $\omega$ has the form $\omega = \omega(k)$ for some $k \in \mathbb{N}_0$, i.e., that

$$\omega(k) = (\omega_1, \dots, \omega_n, \underbrace{0, 0, 0, \dots}_{\infty \text{ string of zeroes}}),$$

where

$$k = \omega_1 + \omega_2 N + \cdots + \omega_n N^{n-1}. \tag{7.4.14}$$

Using (7.4.12), we then get

$$P_x^{(n+1)}[f] = \sum_{(\omega_1, \dots, \omega_n)} \sum_{(\xi_1, \dots, \xi_n)} W_{\omega_1}(\tau_{\xi_1} x)\, W_{\omega_2}(\tau_{\xi_2} \tau_{\xi_1} x) \cdots W_{\omega_n}(\tau_{\xi_n} \cdots \tau_{\xi_1} x) \sum_{\xi_{n+1}} W_0(\tau_{\xi_{n+1}} \tau_{\xi_n} \cdots \tau_{\xi_1} x)\, f(\omega_1, \dots, \omega_n; \xi_1, \dots, \xi_n).$$

But note that $\sum_{\xi_{n+1}} W_0(\tau_{\xi_{n+1}} \cdots) = I_{\mathcal{H}}$ by (7.4.10), so the desired consistency relation (7.4.13) follows. Now the last step, extending the consistent family $(P_x^{(n)})_{n \in \mathbb{N}}$ to a Radon measure on $\mathbb{N}_0 \times \Omega$, follows the reasoning used earlier in Lemmas 7.4.1 and 2.4.1 above. $\square$


Remark 7.4.4. The measures $P_x$ in Lemma 7.4.3 are typically not probability measures. See Proposition 7.5.2 for details.
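One way to get a feeling for the remark is to evaluate the right-hand side of (7.4.11) in a scalar example. The sketch below assumes $N = 2$, $\tau_j(x) = (x + j)/2$, and the Haar weights $W_0(y) = \cos^2(\pi y)$, $W_1(y) = \sin^2(\pi y)$. It only checks the normalization (7.4.10) numerically and is not the argument of Proposition 7.5.2; the function names are hypothetical.

```python
import itertools
import math

# Haar weights W_0 = |m_0|^2 and W_1 = |m_1|^2 (assumed example for N = 2).
W = [lambda y: math.cos(math.pi * y) ** 2,
     lambda y: math.sin(math.pi * y) ** 2]


def weight(omega, xi, x):
    """Scalar version of the right-hand side of (7.4.11)."""
    value, point = 1.0, x
    for i, j in zip(omega, xi):
        point = (point + j) / 2      # tau_j applied to the current point
        value *= W[i](point)
    return value


def xi_sum(omega, x):
    """Sum the weights over all xi-strings of the same length as omega."""
    strings = itertools.product(range(2), repeat=len(omega))
    return sum(weight(omega, xi, x) for xi in strings)


x = 0.37
per_omega = {omega: xi_sum(omega, x)
             for omega in itertools.product(range(2), repeat=3)}
total = sum(per_omega.values())
print(per_omega, total)
```

For every fixed $\omega$-string the sum over the $\xi$-strings is 1, by repeated use of (7.4.10). Adding over the eight $\omega$-strings of length three therefore gives total weight 8 rather than 1 in this example.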


7.5 Wavelet packets 


Remark 7.5.1. In the remaining chapters, we illustrate our presentation of wavelets and wavelet packets with graphics series, both in Chapters 7 and 9. We begin by recalling the wavelet filters and the corresponding pyramid algorithm which is used in the construction of a sequence of wavelet packet functions $\varphi_0, \varphi_1, \varphi_2, \dots$. Starting with the Daubechies wavelet filter, Figure 7.5 illustrates the first 32 functions in the sequence. As is expected from the theory, one observes the time-frequency behavior which makes the wavelet packets especially adaptable to a given data set, or a given image (in the case of two variables).

Fig. 7.6. A decreasing family of resolution subspaces $\mathcal{V}_0 \supset \mathcal{V}_1 \supset \mathcal{V}_2 \supset \cdots$, and corresponding detail spaces $\mathcal{W}_1, \mathcal{W}_2, \dots$. Dyadic wavelets. $\mathcal{V}_n = \mathcal{V}_{n+1} + \mathcal{W}_{n+1}$, $n = 0, 1, \dots$.

The next seven figures serve to illustrate the multifaceted features of pyramid 
algorithms, and to stress the distinction between the algorithmic paths of the standard 
wavelet construction and the choices going into the selection of a function in the 
library of a wavelet packet construction. 

The standard dyadic case begins with formulas (1.3.18) and (1.3.19), while the $N$-adic case uses (1.3.1) instead of (1.3.18), but then (1.3.19) is replaced by $N - 1$ identities corresponding to the $N - 1$ higher-frequency subbands, and the associated subband-filter functions $m_1, m_2, \dots, m_{N-1}$, as illustrated in Figures 7.9 and 7.10. The distinction between the dyadic case, i.e., $N = 2$, versus the case of more than two subbands in the encoding of subspaces, is illustrated in Figure 7.10 in the special case of $N = 4$.

Although subband filters were first used in signal processing, see Figure 7.7, they have now been adapted to the wavelet algorithm as outlined in Figures 7.8, 7.9, and 7.10. The coefficients in the two filter functions $m_0$ and $m_1$ of (7.5.1) and (7.5.2) are at the same time the masking coefficients for the first two wavelet functions, the father function $\varphi = \varphi_0$ and the mother function $\psi = \varphi_1$. While for the wavelet algorithm, the same dyadic choice is made in each scaling/subdivision step, in contrast the wavelet packet algorithm is encoded by a string of separate dyadic choices, as is illustrated in Figures 7.11 and 7.12, and in the theory part of this section; see especially formulas (7.5.5)(A)–(B), as well as Figure 7.5 (pp. 120-121). The Mathematica program code for the graphics in Figure 7.5 is included in the “References and remarks” section at the end of this chapter.

Fig. 7.7. Subband filtering for dyadic wavelets: $m_0$ = low-pass filter, $m_1$ = high-pass filter, $\downarrow$ down-sampling, $\uparrow$ up-sampling. The figure refers to input (from left) and output (to the right) of signals from standard signal processing; for example, of speech signals. From the left, we start with (signal) input. The signal is then split into two frequency bands with two filters $m_0$ and $m_1$. The first filter, $m_0$, passes low frequencies, and the second, $m_1$, passes high. The split into frequency bands is called “analysis.” We get two signal strands which are now each followed by signal down-sampling. As we move right, the two bands are then up-sampled. The processed signal bands then pass dual filters, and are finally merged again with + into an output signal. “Perfect reconstruction” means that the output signal to the right is matched up perfectly with the input from the left. From engineering, we know that it is possible to find filters $m_0$ and $m_1$ that achieve perfect reconstruction; i.e., recovering the input signal by synthesis of the bands. Magically, and by hindsight, these filter systems serve at the same time to give us orthogonal wavelets, subject to a technical condition discussed in Chapter 5. (There can be more than two frequency bands, and we refer to Section 7.6 for the general case.)
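The analysis/synthesis loop described in the caption of Figure 7.7 can be sketched for the simplest filter pair. The code below is an illustration under assumptions: it uses the two-tap Haar filters with the orthonormal scaling $1/\sqrt{2}$ common in signal processing (which differs from the normalization $\sum_j a_j = 1$ used in this chapter), and it fuses each filter with its down- or up-sampling step. The function names are hypothetical.

```python
import math


def analyze(signal):
    """Split an even-length signal into low- and high-pass bands, down-sampled by 2."""
    s = 1 / math.sqrt(2)
    half = len(signal) // 2
    low = [s * (signal[2 * i] + signal[2 * i + 1]) for i in range(half)]
    high = [s * (signal[2 * i] - signal[2 * i + 1]) for i in range(half)]
    return low, high


def synthesize(low, high):
    """Up-sample both bands, apply the dual filters, and add the two strands."""
    s = 1 / math.sqrt(2)
    out = []
    for lo, hi in zip(low, high):
        out.extend([s * (lo + hi), s * (lo - hi)])
    return out


signal = [4.0, 2.0, -1.0, 3.0, 0.5, 0.5, 7.0, -2.0]
low, high = analyze(signal)
rebuilt = synthesize(low, high)

energy_in = sum(v * v for v in signal)
energy_bands = sum(v * v for v in low) + sum(v * v for v in high)
print(rebuilt, energy_in, energy_bands)
```

Here `rebuilt` agrees with `signal` up to rounding (perfect reconstruction), and the energy of the input equals the combined energy of the two bands, which is the finite-dimensional counterpart of the orthogonality mentioned in the caption.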


The tiling aspect of the fundamental subdivision/wavelet packet issue is perhaps especially clear for Haar’s construction as illustrated in the corresponding series of pictures in Figure 7.4 (pp. 118-119). Different geometric and algorithmic formulas are further illustrated in other forms in the pictures and formulas collected in Figure 7.6 (subspaces), Figure 7.7 (the engineering approach to subbands), Figures 7.8–7.9 (subbands and matrix operations), Figures 7.10–7.11 (a particular path in the wavelet-packet tree), and Figure 7.13 which spells out the relation between the subspaces and the associated matrix operations. The same issues are illustrated with the representations from Section 7.6 below. Figure 7.16 (p. 134) outlines the subdivision algorithm which goes into the graphics, creating a step-by-step design of both the father function $\varphi$ and the mother function $\psi$ for the Daubechies wavelet. The pair of formulas behind Figure 7.16 are special cases of (1.3.1), (d) in Example 1.3.3, (1.3.19) from Chapter 1, as well as (7.5.1)–(7.5.2), and (7.6.6)–(7.6.7) in the present chapter. The general use of the pyramid is discussed and illustrated further in Figures 8.1 and 8.2 in the next chapter (pp. 159-161).

Fig. 7.8. Wavelet algorithm in the dyadic case: $\mathcal{V}_n = \mathcal{V}_{n+1} + \mathcal{W}_{n+1}$, $n = 0, 1, \dots$.

The corresponding algorithms and figures are created by Brian Treadway in 
Mathematica. The series of pictures to follow have been created so as to offer visual 
illustrations of some of the basic wavelet algorithms, both the father function/mother 
function algorithms, and the related pyramid algorithms used in the creation of wave- 
let packets. Following the progressions in the picture series, the reader will be able 
to follow visually the corresponding algorithmic steps, see for example the (A) and 
(B) parts of equations (7.5.5) and (7.5.6). 

Caution: Readers comparing the present figures and definitions with other related ones in the literature, for example in [BrJo02b], will notice some differences in normalization conventions: For example, here we normalize so that the low-pass property reads $|m(\cdot)|^2 = 1$ at frequency zero. This is to stress the probabilistic meaning of the term $|m(\cdot)|^2$ for low-pass. The meaning of “low-pass filter” from signal processing: The signals with low frequency will move through the eye of the low-pass filter with high probability, while they will be essentially blocked by the high-pass filters.

Besides the normalization convention, there is another difference between (2.5.25) in [BrJo02b] and our present equation (7.5.5)(B): The index on $a$ (or $\bar{a}$) is $5 - k$ in [BrJo02b], while it is $1 - k$ (actually $1 - j$) here.

This shifts the support interval for the corresponding mother function by an integer. To keep Daubechies’ scaling function $\varphi_0$ and the wavelet packet series $\varphi_1, \varphi_2, \dots$ confined to the interval $[0, 3]$, we need the index to be $3 - k$.

This does not affect the computation, but it changes the tick marks on the horizontal axes of the final figure plots. When comparing the present pictures with others, the reader should allow for translation on the $x$-axis by integer amounts for the function $\varphi_1$ ($= \psi$); note that there is an integer translation in the frame setup. And for the higher wavelet packets, the offset appears to be by non-integral amounts such as 1/2, 1/4, 1/8, etc.

Fig. 7.10. A decreasing family of resolution subspaces $\mathcal{V}_0 \supset \mathcal{V}_1 \supset \mathcal{V}_2 \supset \cdots$, $N$-adic case, $N = 4$, $\mathcal{V}_{n-1} = \mathcal{V}_n + \mathcal{W}_n$, $\mathcal{W}_n = \mathcal{W}_n^{(1)} + \mathcal{W}_n^{(2)} + \mathcal{W}_n^{(3)}$, $n = 1, 2, \dots$.

Fig. 7.11. A selection of a dyadic wavelet packet. (An example of a path in a pyramid!)


Fig. 7.12. Illustration of a path in the pyramid: my — m, > my; % my > m, -, or 
V0 7 VW. > W3 7 3 Ws. 


Similarly, the normalization of $m$ mentioned above will not affect the computation of $\varphi$, provided that a suitable factor appears in (7.5.5): in (7.5.5)(A), for example, $\varphi_0$ is given as a convolution of itself with the row of $a$’s, multiplied by the initial factor, so the magnitude of that initial factor is determined by a sum rule.

As we have our “normalization” in (7.5.5), the Haar case, i.e., Example 1.3.3(a), is $a_0 = a_1 = 1/2$, and the other $a_i$’s zero. As noted above, our “normalization” here is a little different from [BrJo02b].

Here we have $\sum_i a_i = 1$; see (7.6.6).

The $a_i$ numbers for the Daubechies wavelet are given in Example 1.3.3(d), and they are $a_0, \dots, a_3$ for the non-zero ones, starting with $a_0 = (1 + \sqrt{3})/8$, etc.; see (7.6.6). Remember $\sum_i a_i = 1$ from our choice (1.3.14).
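The sum rule and the low-pass normalization can be checked directly for the four-tap case. In the sketch below only $a_0 = (1 + \sqrt{3})/8$ is quoted from the text; the values used for $a_1, a_2, a_3$ are the standard four-tap Daubechies numbers rescaled so that the sum is 1, and they are stated here as an assumption because Example 1.3.3(d) is not reproduced in this section.

```python
import cmath
import math

SQRT3 = math.sqrt(3)
# a_0 is quoted in the text; a_1, a_2, a_3 are the standard four-tap values (assumed).
A = [(1 + SQRT3) / 8, (3 + SQRT3) / 8, (3 - SQRT3) / 8, (1 - SQRT3) / 8]


def m0(x, a=A):
    """Low-pass filter m_0(x) = sum_k a_k exp(-i 2 pi k x)."""
    return sum(ak * cmath.exp(-2j * math.pi * k * x) for k, ak in enumerate(a))


coefficient_sum = sum(A)

# Largest deviation of |m_0(x)|^2 + |m_0(x + 1/2)|^2 from 1 on a sample grid.
qmf_defect = max(
    abs(abs(m0(x)) ** 2 + abs(m0(x + 0.5)) ** 2 - 1.0)
    for x in [i / 50 for i in range(50)]
)
print(coefficient_sum, abs(m0(0.0)), qmf_defect)
```

The coefficient sum is 1, so $m_0(0) = 1$ and $|m_0|^2 = 1$ at frequency zero, in line with the normalization convention described above. The quantity `qmf_defect` is at rounding level, which is the scalar form of the normalization $\sum_{\sigma(y) = x} |m_0(y)|^2 = 1$.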
Matrix multiplications:

Fig. 7.13. Combined filter/down-sampling in matrix form.


The algorithm in (7.5.5) is simple. Here are the steps. The two first $\varphi_i$ functions are familiar: $\varphi_0 = \varphi$ (father function), $\varphi_1 = \psi$ (mother function), and we have them in Fig. 1.2(a), the Haar case (p. 13), and Fig. 7.16, the Daubechies case (d) (p. 134).

The general case: List the numbers $a_0, a_1, \dots$, up to the last non-zero number. Then do the first step (A) in (7.5.5) to get from $\varphi_1$ to $\varphi_2$.

The second step (B) in (7.5.5) amounts to running the numbers $a_i$ in reverse, and alternating the signs; see (7.6.7). The second step gets us from $\varphi_1$ to $\varphi_3$.

Now back to the first step in (7.5.5), and we get from $\varphi_2$ to $\varphi_4$. Then return back to the second step; this time applied to $\varphi_2$, and we get $\varphi_5$.

And so on: All the while, walking zig-zag through the algorithm; the first step takes us from $\varphi_k$ to $\varphi_{2k}$, and the second step from $\varphi_k$ to $\varphi_{2k+1}$ (see Figures 7.2 and 7.3, pp. 112-113).

After having found the first two, $\varphi_0$ and $\varphi_1$, in the usual way, one procedure for doing all the steps in succession is simply making a string of zig-zag steps as follows:

(A): 1 → 2; (B): 1 → 3; (A): 2 → 4; (B): 2 → 5;
(A): 3 → 6; (B): 3 → 7; (A): 4 → 8; (B): 4 → 9; etc.

This zig-zag procedure is done “wholesale” in the Mathematica program: instead of 1 → 2, 1 → 3, 2 → 4, 2 → 5, etc., it is expressed (equivalently) in terms of lists:

1 → {2, 3} → {{4, 5}, {6, 7}} → {{{8, 9}, {10, 11}}, {{12, 13}, {14, 15}}},

etc., with each arrow representing one instance of the “map” operation. The nested sublists are “flattened” to a single list only in the final step.

In fact, the implementation starting with $\varphi_0$ gives the whole series from 0 at each stage because $\varphi_0$ regenerates itself in step (A):

0 → {0, 1} → {{0, 1}, {2, 3}} → {{{0, 1}, {2, 3}}, {{4, 5}, {6, 7}}} → etc.
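The index bookkeeping of the zig-zag procedure can be reproduced in a few lines. This is a sketch of the list manipulation only (indices, not functions), written in Python rather than Mathematica; the names `zigzag` and `flatten` are hypothetical stand-ins for the “map” and “flatten” operations mentioned above.

```python
def zigzag(start, depth):
    """Nested index lists: each map step sends k to the pair [2k, 2k + 1]."""
    def step(node):
        if isinstance(node, list):
            return [step(child) for child in node]
        return [2 * node, 2 * node + 1]      # (A) gives 2k, (B) gives 2k + 1

    tree = start
    for _ in range(depth):
        tree = step(tree)
    return tree


def flatten(node):
    """Flatten the nested sublists to a single list, as in the final step."""
    if not isinstance(node, list):
        return [node]
    return [leaf for child in node for leaf in flatten(child)]


from_one = zigzag(1, 3)
from_zero = zigzag(0, 3)
print(from_one)
print(from_zero)
print(flatten(from_zero))
```

Starting from 1, three map steps give the nested list with leaves 8 through 15; starting from 0, the same three steps give leaves 0 through 7, which reflects the fact that index 0 reproduces itself under step (A).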


We now turn to one of the uses of Lemma 7.4.3 in the theory of wavelet packets. To make the ideas more transparent, we state the result only in the special case when $N = 2$, and where the $\mathbb{Z}$-translation-invariant subspace $\mathcal{V}_0$ is generated by a single function $\varphi_0 = \varphi$ in $L^2(\mathbb{R})$ which is known to satisfy (1.3.1), or equivalently (1.3.4). Set

$$m_0(x) = \sum_{k} a_k\, e^{-i2\pi k x}, \tag{7.5.1}$$

$$m_1(x) = \sum_{k} (-1)^k\, \bar{a}_{1-k}\, e^{-i2\pi k x}, \tag{7.5.2}$$

and

$$W_i = |m_i|^2, \qquad i = 0, 1. \tag{7.5.3}$$

The orthogonality conditions state that

$$\text{the matrix } \begin{pmatrix} m_0(x) & m_0(x + \tfrac{1}{2}) \\ m_1(x) & m_1(x + \tfrac{1}{2}) \end{pmatrix} \text{ is unitary for } x \in \mathbb{R}. \tag{7.5.4}$$

The two recursive formulas (7.5.5)(A)–(B) form the basis for the so-called pyramid algorithm, i.e., the building of basis functions in algorithmic steps following paths in a combinatorial tree. This is illustrated graphically both with the series of figures, Figures 7.1–7.3 (pp. 111-113) and 7.14–7.15 (pp. 132-133), and especially the pair Figures 7.2 and 7.3 (pp. 112-113). It is further illustrated with the figures which follow in the rest of this chapter, as well as the next.


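Proposition 7.5.2. Let the functions $\varphi_0, \varphi_1, \varphi_2, \dots$ be defined by

$$\begin{aligned}
\text{(A)}\quad \varphi_{2k}(t) &= 2\sum_{j\in\mathbb{Z}} a_j\, \varphi_k(2t-j),\\
\text{(B)}\quad \varphi_{2k+1}(t) &= 2\sum_{j\in\mathbb{Z}} (-1)^j\, \bar{a}_{1-j}\, \varphi_k(2t-j),
\end{aligned} \tag{7.5.5}$$

or equivalently by

$$\begin{aligned}
\text{(A)}\quad \hat\varphi_{2k}(\xi) &= m_0(\xi/2)\, \hat\varphi_k(\xi/2),\\
\text{(B)}\quad \hat\varphi_{2k+1}(\xi) &= m_1(\xi/2)\, \hat\varphi_k(\xi/2).
\end{aligned} \tag{7.5.6}$$

Suppose $k = i_1 + i_2 2 + \dots + i_n 2^{n-1}$. Then

$$\hat\varphi_k(\xi) = m_{i_1}(\xi/2)\, m_{i_2}(\xi/4) \cdots m_{i_n}(\xi/2^n)\, \hat\varphi(\xi/2^n) \tag{7.5.7}$$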


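satisfies

$$\sum_{j\in\mathbb{Z}} \sum_{k=0}^{\infty} \bigl|\langle \varphi_k(\cdot - j) \mid f \rangle\bigr|^2 = \int_{\mathbb{R}} |f(t)|^2\, dt, \qquad f \in L^2(\mathbb{R}). \tag{7.5.8}$$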


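Let $(P_x)_{x\in[0,1]}$ be the measure on $\mathbb{N}_0 \times \Omega$ from Lemma 7.4.3. Then

$$P_x\bigl(\{(\omega(k), \omega(l))\}\bigr) = |\hat\varphi_k(x + l)|^2, \tag{7.5.9}$$

where the correspondence $\omega(k) \leftrightarrow k$, $\omega(l) \leftrightarrow l$ is determined as in (7.4.14), i.e.,
$\omega(k) = (\omega_1, \dots, \omega_n, \underline{0})$, and similarly for $l \in \mathbb{N}_0$. Moreover, $\{\varphi_k(\cdot - l)\}_{k\in\mathbb{N}_0,\, l\in\mathbb{Z}}$
is an orthonormal basis for $L^2(\mathbb{R})$ if and only if

$$P_x\bigl(\{\omega(k)\} \times \mathbb{Z}\bigr) = 1 \quad \text{a.e. } x \in [0,1] \text{ and all } k \in \mathbb{N}_0. \tag{7.5.10}$$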


Proof. The detailed steps are quite analogous to those given in Chapter 2, the main
difference being that now Lemma 7.4.3 is used in place of Lemma 2.4.1 in the
standard wavelet case. □
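
For a concrete feel for the recursion (A)–(B), here is a minimal sketch in the simplest case. It assumes the Haar coefficients $a_0 = a_1 = 1/2$ and $\varphi_0$ equal to the indicator function of $[0,1)$ (both are assumptions made only for this illustration), so that (A) reads $\varphi_{2k}(t) = \varphi_k(2t) + \varphi_k(2t-1)$ and (B) reads $\varphi_{2k+1}(t) = \varphi_k(2t) - \varphi_k(2t-1)$. Functions are stored as lists of values on equal cells of $[0,1)$; the names `packet_step`, `haar_packets`, and `inner` are hypothetical.

```python
def packet_step(values):
    """Haar case of (7.5.5): from samples of phi_k build phi_{2k} and phi_{2k+1}.

    `values` lists phi_k on equal cells of [0, 1); the results use twice as many cells.
    """
    even = values + values                  # (A): phi_k(2t) + phi_k(2t - 1)
    odd = values + [-v for v in values]     # (B): phi_k(2t) - phi_k(2t - 1)
    return even, odd


def haar_packets(levels):
    """Return phi_0, ..., phi_{2^levels - 1}, each sampled on 2^levels cells of [0, 1)."""
    funcs = [[1]]                           # phi_0 = indicator function of [0, 1)
    for _ in range(levels):
        nxt = []
        for v in funcs:                     # funcs[k] yields entries 2k and 2k + 1
            even, odd = packet_step(v)
            nxt.extend([even, odd])
        funcs = nxt
    return funcs


def inner(u, v):
    """L^2(0, 1) inner product of two real step functions on the same grid."""
    return sum(x * y for x, y in zip(u, v)) / len(u)


packets = haar_packets(3)
gram = [[inner(u, v) for v in packets] for u in packets]
print(gram[1][1], gram[1][2])   # diagonal and off-diagonal Gram entries
```

In this special case the functions produced are step functions taking the values $\pm 1$, and the Gram matrix shows that the first $2^n$ of them are orthonormal in $L^2(0,1)$.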


Remark 7.5.3. Note that if $k = 0$ in (7.5.5)(A), then we recover the scaling identity
(1.3.1) in the special case $N = 2$. Depending on the conditions placed on the
coefficients $(a_j)$, we get solutions $\varphi = \varphi_0$ in various function spaces, or in spaces of
distributions. If (7.5.5)(A), or (1.3.1), is known to have a solution in $L^1(\mathbb{R})$, then an
integration on $\mathbb{R}$ yields $\sum_j a_j = 1$, and we recover the familiar low-pass property
for the wavelet filter

$$m_0(x) = \sum_{j\in\mathbb{Z}} a_j e^{-i 2\pi j x}$$

in the form of $m_0(0) = 1$. This is an interesting (perhaps unexpected) link between
wavelets and signal processing.
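
The integration step in Remark 7.5.3 can be spelled out as follows. This is only a sketch: it assumes, beyond what is stated in the remark, that $\int_{\mathbb{R}} \varphi(t)\,dt \neq 0$ and that the sum over $j$ may be interchanged with the integral (for example, when only finitely many $a_j$ are non-zero).

```latex
\int_{\mathbb{R}} \varphi(t)\,dt
  = 2\sum_{j\in\mathbb{Z}} a_j \int_{\mathbb{R}} \varphi(2t-j)\,dt
  = 2\sum_{j\in\mathbb{Z}} a_j \cdot \frac{1}{2}\int_{\mathbb{R}} \varphi(s)\,ds
  = \Bigl(\sum_{j\in\mathbb{Z}} a_j\Bigr) \int_{\mathbb{R}} \varphi(s)\,ds .
```

Dividing by the non-zero integral gives $\sum_j a_j = 1$, which is the same statement as $m_0(0) = 1$.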

The appearance of the graphs of the progression of functions $\varphi_0, \varphi_1, \varphi_2, \dots$
in Figure 7.5 (pp. 120-121) offers a visually convincing argument for the name
“wavelet packet” for this cascade construction. As we progress in the series of
functions from the union of the two figures, we see that each of the functions $\varphi_n$,
for $n$ sufficiently large, is made up of two separate and distinct visual parts: one is a
high-frequency wave, which appears packed or shaped inside a shadow envelope of
another form with low frequency (i.e., long wavelength).

The term “wave packet” appears to have been used first by Werner Heisenberg 
in connection with particle-wave duality in quantum theory in the 1930s. The visual 
situation in Heisenberg’s theory was much the same as the one which comes out 
from the much later wavelet constructions. In any case, Heisenberg’s aim for his 
construction in the early days of quantum theory was quite different from what came 
later with wavelets, and Heisenberg most likely did not know about Haar’s wavelet 
from 1910; see [Haa10]. The name “wavelet packet” came much later than both
Haar and Heisenberg. But it was probably chosen by analogy to Heisenberg’s wave 
packets by the authors of [CoMW95] and [CoWi93]. 


Remark 7.5.4. In (A)–(B) in Proposition 7.5.2 and their analogues (1.3.18)–(1.3.19),
we considered the crucial recursive identities only in the scalar case. But in fact there
are matrix versions. The main modification in the matrix case is that the coefficients
are then matrices, and the functions involved are vector-valued. However, both in the
scalar case, and in the matrix case, there is a pair of masking coefficients $a_k$ and $b_k$.
The above-mentioned quadrature conditions (see, e.g., Figure 7.7, p. 124) dictate a
certain relationship between the two: Both in the scalar and the matrix versions of
the pair $a_k$, $b_k$, the relationship is

$$b_k = (-1)^k\, \bar{a}_{\text{odd}-k}. \tag{7.5.11}$$

Depending on the plan of the picture and the number of $a_k$’s, there are different useful
choices for the odd number in (7.5.11).

The pair of frequency response functions which correspond to the assignment
(7.5.11) made above will then be as follows:

$$m_0(z) = \sum_k a_k z^k,$$

$$m_1(z) = -z^{\text{odd}}\; \overline{m_0(-z)}, \qquad |z| = 1.$$
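
The relation between the coefficient rule $b_k = (-1)^k\,\bar a_{\text{odd}-k}$ and the formula for $m_1$ on the unit circle, including the overall sign, can be checked numerically. The sketch below is an illustration only: the scalar test coefficients are arbitrary (they are not a wavelet filter), and the names `m0`, `m1`, and `relation_gap` are hypothetical.

```python
import cmath
import math


def m0(z, a):
    """m_0(z) = sum_k a_k z^k for the finite list a = [a_0, ..., a_{T-1}]."""
    return sum(complex(ak) * z**k for k, ak in enumerate(a))


def m1(z, a, odd):
    """m_1(z) = sum_k b_k z^k with b_k = (-1)^k conj(a_{odd-k})."""
    total = 0j
    for j, aj in enumerate(a):              # j = odd - k, so k = odd - j
        k = odd - j
        total += (-1) ** k * complex(aj).conjugate() * z**k
    return total


def relation_gap(a, odd, x):
    """|m_1(z) + z^odd * conj(m_0(-z))| at z = exp(-i 2 pi x); zero if the identity holds."""
    z = cmath.exp(-2j * math.pi * x)
    return abs(m1(z, a, odd) + z**odd * m0(-z, a).conjugate())


a = [0.3 + 0.1j, -0.2j, 0.5, 0.1]           # arbitrary test coefficients (assumption)
print(max(relation_gap(a, 3, t / 8 + 0.01) for t in range(8)))
```

The gap stays at rounding level for every odd shift tried, which matches the substitution $j = \text{odd} - k$ in the sum defining $m_1$ together with $\bar z = z^{-1}$ on $|z| = 1$.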


7.6 Representations of the Cuntz algebra $\mathcal{O}_2$


The Cuntz algebra $\mathcal{O}_N$, see [Cun77], is a simple C*-algebra on the relations

$$S_i^* S_j = \delta_{i,j}\, I, \qquad \sum_{j=0}^{N-1} S_j S_j^* = I. \tag{7.6.1}$$


When $\mathcal{O}_N$ is realized by operators on a Hilbert space $\mathcal{H}$, we take $I$ to be the identity
operator on $\mathcal{H}$, i.e., $I\colon h \mapsto h$, $h \in \mathcal{H}$. These operators are fundamental in signal
processing, where they take the form of signal/wavelet diagrams as illustrated in the
series of Figures 7.7–7.15 (pp. 124-128, 132-133), and especially in Figures 7.7 and
7.14 from signal processing.

[Figure 7.14: filter-bank diagram with the labels ANALYSIS, SYNTHESIS, SIGNAL IN, SIGNAL OUT,
low-pass filter ($S_0^*$), dual low-pass filter ($S_0$), down-sampling,
high-pass filter ($S_1^*$), dual high-pass filter ($S_1$).]

Fig. 7.14. Perfect reconstruction of signals.


The choice of the numbers $(a_j)_{j\in\mathbb{Z}}$ used in $m_0$, $m_1$ from (7.5.1)–(7.5.4) guarantees
that the corresponding two operators

$$(S_i f)(z) = \sqrt{2}\, m_i(z)\, f(z^2), \qquad i = 0, 1,\ f \in L^2(\mathbb{T}), \tag{7.6.2}$$

satisfy (7.6.1) for $\mathcal{H} = L^2(\mathbb{T})$ and $N = 2$.
If we use 1-periodic functions on $\mathbb{R}$ instead of functions on $\mathbb{T}$, then (7.6.2) takes
the equivalent form

$$(S_i f)(x) = \sqrt{2}\, m_i(x)\, f(2x), \qquad i = 0, 1,\ f \in L^2(0,1). \tag{7.6.3}$$

The two functions $m_0$ and $m_1$ are called the low-pass/high-pass filters respectively,
and we have $m_0(0) = 1$, $m_1(1/2) = 1$, when the $m_i$’s are viewed as 1-periodic
functions.


Lemma 7.6.1. Let the functions $m_0$, $m_1$ satisfy the unitarity condition (7.5.4) and
let $S_i$, $i = 0, 1$, be the corresponding operators. Then the operator relations (7.6.1)
of Cuntz are satisfied.


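Proof. We refer to [BrJo02b] for the detailed verification that (7.6.1) is satisfied
when the unitarity property (7.5.4) is assumed. The essential step is the following
identity:

$$\sum_{i=0}^{1} \|S_i^* f\|^2 = \|f\|^2, \qquad f \in L^2(\mathbb{T}). \tag{7.6.4}$$

To verify (7.6.4), first check that

$$(S_i^* f)(x) = \frac{1}{\sqrt{2}} \sum_{r=0}^{1} \overline{m_i\Bigl(\frac{x+r}{2}\Bigr)}\, f\Bigl(\frac{x+r}{2}\Bigr), \qquad f \in L^2(0,1). \tag{7.6.5}$$

And so

$$\begin{aligned}
\sum_{i=0}^{1} \|S_i^* f\|^2
&= \sum_{i=0}^{1} \int_0^1 |(S_i^* f)(x)|^2\, dx \\
&= \frac12 \sum_{i=0}^{1} \sum_{r=0}^{1} \sum_{s=0}^{1} \int_0^1 m_i\Bigl(\frac{x+r}{2}\Bigr)\, \overline{m_i\Bigl(\frac{x+s}{2}\Bigr)}\; \overline{f\Bigl(\frac{x+r}{2}\Bigr)}\, f\Bigl(\frac{x+s}{2}\Bigr)\, dx \\
&= \frac12 \sum_{r=0}^{1} \sum_{s=0}^{1} \delta_{r,s} \int_0^1 \overline{f\Bigl(\frac{x+r}{2}\Bigr)}\, f\Bigl(\frac{x+s}{2}\Bigr)\, dx \qquad \text{(first the $i$-summation)} \\
&= \frac12 \sum_{r=0}^{1} \int_0^1 \Bigl| f\Bigl(\frac{x+r}{2}\Bigr) \Bigr|^2 dx = \int_0^1 |f(x)|^2\, dx = \|f\|^2,
\end{aligned}$$

which is the desired identity (7.6.4). □

Fig. 7.15. Iteration of the filter/down-sampling operation as it arises in wavelet algorithms.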

[Figure 7.16, panel (a): graph of the father function, horizontal axis from 0 to 3.]

(a): Father function

The father function $\varphi$ is also called the scaling function and it is the solution to the scaling
identity (d) in Example 1.3.3. It is also the first of the sequence $\varphi_0, \varphi_1, \varphi_2, \dots$ of functions
generated by the algorithm (7.5.5), i.e., $\varphi = \varphi_0$.

[Figure 7.16, panel (b): graph of the mother function, horizontal axis from 0 to 3.]

(b): Mother function

The mother function $\psi$ is also called the wavelet function, and it is the second term in the
sequence of functions generated by the algorithm (7.5.5), i.e., $\psi = \varphi_1$. So this four-tap
Daubechies $\psi$ is the same function as $\varphi_1$ in Figure 7.5 (p. 120).

Fig. 7.16. Daubechies wavelet functions and series of cascade approximants.

We note, see [Jor03], that (7.6.4) is known in signal processing as the quadrature-mirror
identity for the filter operators defined by the two functions $m_i$, $i = 0, 1$, or
equivalently by the sequence $(a_j)_{j\in\mathbb{Z}}$.
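
The quadrature-mirror identity (7.6.4) can be tested directly on finitely supported coefficient sequences. The sketch below works in the Fourier basis $e_k(z) = z^k$, where the adjoint acts by $(S^* f)_k = \sqrt{2} \sum_n \bar c_{n-2k} f_n$ for a filter with coefficients $(c_j)$. As the simplest illustration it uses the Haar coefficients $a_0 = a_1 = 1/2$, which is an assumption of this example; the names `adjoint_apply` and `norm_sq` are hypothetical.

```python
import math


def adjoint_apply(c, f):
    """Coefficients of S^* f, where S(n, k) = sqrt(2) c_{n-2k} in the basis e_k(z) = z^k.

    c and f are dicts {index: coefficient} with finitely many entries.
    """
    out = {}
    for n, fn in f.items():
        for j, cj in c.items():             # j = n - 2k, so k = (n - j) / 2
            if (n - j) % 2 == 0:
                k = (n - j) // 2
                out[k] = out.get(k, 0) + math.sqrt(2) * complex(cj).conjugate() * fn
    return out


def norm_sq(f):
    """Squared l^2 norm of a finitely supported sequence."""
    return sum(abs(v) ** 2 for v in f.values())


a = {0: 0.5, 1: 0.5}                        # Haar low-pass coefficients (assumption)
b = {0: 0.5, 1: -0.5}                       # b_k = (-1)^k a_{1-k}

f = {-2: 1.0, 0: 2 - 1j, 1: 0.5j, 3: -1.5}  # an arbitrary test sequence
lhs = norm_sq(adjoint_apply(a, f)) + norm_sq(adjoint_apply(b, f))
print(lhs, norm_sq(f))
```

The two printed numbers agree up to rounding. The same check applies to any pair of coefficient sequences whose filters satisfy (7.5.4).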

The case of four coefficients $a_0, a_1, a_2, a_3$ is called “four-tap”, and it includes
(with our current normalization)

$$a_0 = \frac{1+\sqrt3}{8}, \quad a_1 = \frac{3+\sqrt3}{8}, \quad a_2 = \frac{3-\sqrt3}{8}, \quad a_3 = \frac{1-\sqrt3}{8}, \tag{7.6.6}$$

the Daubechies wavelet; see [Dau92, p. 235]. If we take

$$m_1(z) = \frac{1-\sqrt3}{8} - \frac{3-\sqrt3}{8}\, z + \frac{3+\sqrt3}{8}\, z^2 - \frac{1+\sqrt3}{8}\, z^3 \tag{7.6.7}$$

for $z = e^{-i 2\pi x}$, then the two functions $\varphi$ (father function) and $\psi$ (mother function)
will be supported in the interval [0,3], they are differentiable almost everywhere (see [Dau92]), and
they graph out as in Figure 7.16.
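
The stated properties of the four-tap filter can be checked numerically. The following sketch takes the coefficients from (7.6.6), forms $b_k = (-1)^k a_{3-k}$ as in (7.6.7), and tests the normalization $\sum_k a_k = 1$, the values $m_0(0) = 1$ and $m_1(1/2) = 1$, and the unitarity condition (7.5.4) at a few sample points. The names `filter_value` and `unitarity_defect` are hypothetical.

```python
import cmath
import math

SQRT3 = math.sqrt(3.0)
A = [(1 + SQRT3) / 8, (3 + SQRT3) / 8, (3 - SQRT3) / 8, (1 - SQRT3) / 8]  # (7.6.6)
B = [(-1) ** k * A[3 - k] for k in range(4)]   # b_k = (-1)^k a_{3-k}, cf. (7.6.7)


def filter_value(coeffs, x):
    """Evaluate sum_k c_k z^k at z = exp(-i 2 pi x)."""
    z = cmath.exp(-2j * math.pi * x)
    return sum(c * z**k for k, c in enumerate(coeffs))


def unitarity_defect(x):
    """Largest entry of |U U^* - I| for the 2x2 matrix U of (7.5.4) at the point x."""
    u = [[filter_value(A, x), filter_value(A, x + 0.5)],
         [filter_value(B, x), filter_value(B, x + 0.5)]]
    defect = 0.0
    for i in range(2):
        for j in range(2):
            entry = sum(u[i][k] * u[j][k].conjugate() for k in range(2))
            defect = max(defect, abs(entry - (1 if i == j else 0)))
    return defect


print(sum(A), abs(filter_value(A, 0.0)), abs(filter_value(B, 0.5)))
print(max(unitarity_defect(x / 16) for x in range(16)))
```

A defect at rounding level at every sample point is consistent with the matrix in (7.5.4) being unitary for this pair of filters.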

We now consider the probability space

$$\Omega = \{0,1\}^{\mathbb{N}},$$

and the finite spaces

$$\Omega(n) = \{0,1\}^{n}.$$

Specifically, $\Omega(n)$ consists of all functions $\omega\colon \{1, 2, \dots, n\} \to \{0,1\}$.
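Lemma 7.6.2. For $n \in \mathbb{N}$ and $\omega \in \Omega(n)$, set

$$S_\omega := S_{\omega(1)} \cdots S_{\omega(n)}, \tag{7.6.8}$$

and

$$E_\omega = S_\omega S_\omega^* = S_{\omega(1)} \cdots S_{\omega(n)}\, S_{\omega(n)}^* \cdots S_{\omega(1)}^*. \tag{7.6.9}$$

Then $\{E_\omega\}_{\omega\in\Omega(n)}$ is a commuting family of orthogonal projections satisfying the
following matrix-unit identities:

$$E_\omega E_{\omega'} = \delta_{\omega,\omega'}\, E_\omega, \qquad \omega, \omega' \in \Omega(n), \tag{7.6.10}$$

and

$$\sum_{\omega\in\Omega(n)} E_\omega = I. \tag{7.6.11}$$

Proof. This is a direct verification; see also [BrJo02b, Jor05]. □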


Theorem 7.6.3. There is a unique projection-valued measure $P$ defined on the Borel
subsets of [0, 1] such that

$$P\Bigl(\Bigl[\frac{\omega(1)}{2} + \dots + \frac{\omega(n)}{2^n},\ \frac{\omega(1)}{2} + \dots + \frac{\omega(n)}{2^n} + \frac{1}{2^n}\Bigr)\Bigr) = E_\omega
\quad \text{for } \omega \in \Omega(n) \text{ and } n \in \mathbb{N}. \tag{7.6.12}$$

Proof. The result follows from Lemmas 2.5.1 (Kolmogorov’s construction), 2.6.2,
and 7.6.2 above; see also [Jor05, Jor04b]. □


Remark 7.6.4. In view of (7.6.11), the measure $P$ from Theorem 7.6.3 satisfies

$$\int_0^1 P(dx) = I, \tag{7.6.13}$$

where $I$ is the identity operator in the Hilbert space $\mathcal{H}$ which carries the representation
of the Cuntz algebra $\mathcal{O}_2$. Specifically, if $f \in \mathcal{H}$, and $\|f\| = 1$, then the measure

$$\mu_f(\cdot) = \langle f \mid P(\cdot) f \rangle = \|P(\cdot) f\|^2 \tag{7.6.14}$$

satisfies

$$\int_0^1 d\mu_f(x) = \mu_f([0,1]) = 1. \tag{7.6.15}$$

In the case

$$\mathcal{H} = L^2(\mathbb{T}) \cong L^2(0,1) \cong \ell^2(\mathbb{Z}),$$

the case $f = \mathbf{1}$ (the constant function 1) was used in [CoMW95] and [CoWi93] in
the construction of libraries of orthonormal wavelet packets, and it was conjectured
that the measure

$$\mu_{\mathbf{1}}(\cdot) = \|P(\cdot)\mathbf{1}\|^2 \tag{7.6.16}$$

is absolutely continuous with respect to Lebesgue measure on [0, 1] for all the wavelet
filter systems of (7.5.1)–(7.5.4).

We do not solve the conjecture here, but we illustrate the question with the use 
of the random-walk model from Chapter 2. 

We first take a closer look at the measure (7.6.16). Since $e_n(z) = z^n$, and $e_0 = \mathbf{1}$,
we shall also use the notation $\mu_0$ for the measure $\mu_{\mathbf{1}}$.


Lemma 7.6.5. For the case $n = 1$ in Theorem 7.6.3, and the measure $\mu_0$ of Remark
7.6.4, we have

(a) $\mu_0\bigl([0, \tfrac12)\bigr) = 2\sum_{k} |a_{2k}|^2$,
(b) $\mu_0\bigl([\tfrac12, 1)\bigr) = 2\sum_{k} |a_{2k+1}|^2$,
(c) $\mu_0(\{x\}) = 0$ for all $x \in [0,1]$.


Remark 7.6.6. When (a)–(b) are applied to the Daubechies wavelet filters (7.6.6)–(7.6.7),
we get

$$\mu_0\bigl([0, \tfrac12)\bigr) = \frac{4-\sqrt3}{8}, \qquad \mu_0\bigl([\tfrac12, 1)\bigr) = \frac{4+\sqrt3}{8}.$$

In the case of the dyadic Haar wavelet, it is easy to check that

$$\mu_0\Bigl(\Bigl[\frac{\omega(1)}{2} + \dots + \frac{\omega(n)}{2^n},\ \frac{\omega(1)}{2} + \dots + \frac{\omega(n)}{2^n} + \frac{1}{2^n}\Bigr)\Bigr) = 2^{-n}
\quad \text{for } \omega \in \Omega(n),\ n \in \mathbb{N}.$$

So for the dyadic Haar wavelet, the measure $\mu_0$ is the Lebesgue measure restricted
to the unit interval [0, 1].
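
The two masses in parts (a) and (b) of Lemma 7.6.5 are easy to evaluate numerically. The sketch below does this for the four-tap coefficients of (7.6.6) and for the Haar coefficients $a_0 = a_1 = 1/2$ (the latter normalization is an assumption of this example); the name `half_interval_masses` is hypothetical.

```python
import math


def half_interval_masses(a):
    """Return (mu_0([0, 1/2)), mu_0([1/2, 1))) for a = [a_0, a_1, ...], Lemma 7.6.5 (a)-(b)."""
    even = 2 * sum(abs(x) ** 2 for x in a[0::2])   # 2 * sum_k |a_{2k}|^2
    odd = 2 * sum(abs(x) ** 2 for x in a[1::2])    # 2 * sum_k |a_{2k+1}|^2
    return even, odd


s3 = math.sqrt(3.0)
daubechies = [(1 + s3) / 8, (3 + s3) / 8, (3 - s3) / 8, (1 - s3) / 8]
haar = [0.5, 0.5]

print(half_interval_masses(daubechies))
print(half_interval_masses(haar))
```

For the four-tap filter the two masses are unequal and add up to 1, while for Haar each half-interval gets mass 1/2, in line with Lebesgue measure.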


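Proof of Lemma 7.6.5. Part (a):

$$\mu_0\bigl([0, \tfrac12)\bigr) = \|S_0^* e_0\|^2 = \frac12 \int_0^1 \Bigl| \sum_{r=0}^{1} \overline{m_0\Bigl(\frac{x+r}{2}\Bigr)} \Bigr|^2 dx
\underset{\text{by Parseval}}{=} 2 \sum_{k} |a_{2k}|^2.$$

Part (b):

$$\mu_0\bigl([\tfrac12, 1)\bigr) = \|S_1^* e_0\|^2 = \frac12 \int_0^1 \Bigl| \sum_{r=0}^{1} \overline{m_1\Bigl(\frac{x+r}{2}\Bigr)} \Bigr|^2 dx
\underset{\text{by Parseval and (7.5.2)}}{=} 2 \sum_{k} |a_{1-2k}|^2 = 2 \sum_{k} |a_{2k+1}|^2.$$

Part (c): Here we refer the reader to [BrJo02b, Theorem 2.2.1, p. 92; Lemma
2.2.3, p. 95]. □
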
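Theorem 7.6.7. Let $(a_j)_{j\in\mathbb{Z}}$, $m_0$ and $m_1$ be as described above, and write $a^{(0)}_j := a_j$,
$a^{(1)}_j := (-1)^j\, \bar{a}_{1-j}$ for the Fourier coefficients of $m_0$ and $m_1$. Let $n \in \mathbb{N}$, and
$\omega \in \Omega(n)$. Then

$$\mu_0\Bigl(\Bigl[\frac{\omega(1)}{2} + \dots + \frac{\omega(n)}{2^n},\ \frac{\omega(1)}{2} + \dots + \frac{\omega(n)}{2^n} + \frac{1}{2^n}\Bigr)\Bigr)
= 2^n \sum_{k\in\mathbb{Z}} \Bigl| \sum_{\substack{\xi_1, \dots, \xi_n \in \mathbb{Z} \\ \xi_1 + 2\xi_2 + \dots + 2^{n-1}\xi_n = 2^n k}} a^{(\omega(1))}_{\xi_1}\, a^{(\omega(2))}_{\xi_2} \cdots a^{(\omega(n))}_{\xi_n} \Bigr|^2.$$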


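Proof. For $n \in \mathbb{N}$, $\omega \in \Omega(n)$, set

$$m^{(\omega)}(x) = m_{\omega(1)}(x)\, m_{\omega(2)}(2x) \cdots m_{\omega(n)}(2^{n-1}x). \tag{7.6.17}$$

Then

$$(S_\omega f)(x) = 2^{n/2}\, m^{(\omega)}(x)\, f(2^n x) \tag{7.6.18}$$

and

$$(S_\omega^* f)(x) = 2^{-n/2} \sum_{2^n y \equiv x \bmod 1} \overline{m^{(\omega)}(y)}\, f(y) \tag{7.6.19}$$

by the argument from (7.6.5). Note that the summation in (7.6.19) is over the set of
all dyadic fractions

$$y = \frac{x+r}{2^n}, \qquad r = 0, 1, \dots, 2^n - 1, \tag{7.6.20}$$

where each $r$ in (7.6.20) has a unique representation

$$r = \xi_1 + \xi_2 2 + \dots + \xi_n 2^{n-1}, \qquad \xi \in \Omega(n). \tag{7.6.21}$$

Using now Lemma 7.6.5 on $m^{(\omega)}$ and (7.6.18), and writing $a^{(i)}_j$ for the $j$-th Fourier
coefficient of $m_i$, we get

$$\begin{aligned}
\mu_0\Bigl(\Bigl[\frac{\omega(1)}{2} + \dots + \frac{\omega(n)}{2^n},\ \frac{\omega(1)}{2} + \dots + \frac{\omega(n)}{2^n} + \frac{1}{2^n}\Bigr)\Bigr)
&= \|S_\omega^* e_0\|^2 \\
&= 2^n \int_0^1 \Bigl| \frac{1}{2^n} \sum_{2^n y \equiv x \bmod 1} \overline{m^{(\omega)}(y)} \Bigr|^2 dx \\
&= 2^n \sum_{k\in\mathbb{Z}} \Bigl| \sum_{\substack{\xi_1, \dots, \xi_n \in \mathbb{Z} \\ \xi_1 + 2\xi_2 + \dots + 2^{n-1}\xi_n = 2^n k}} a^{(\omega(1))}_{\xi_1}\, a^{(\omega(2))}_{\xi_2} \cdots a^{(\omega(n))}_{\xi_n} \Bigr|^2,
\end{aligned}$$

which is the desired result. □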
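
The computation in this proof translates into a short numerical procedure: expand the product $m^{(\omega)}$ of (7.6.17) as a polynomial in $z = e^{-i2\pi x}$, keep the coefficients whose index is a multiple of $2^n$, and sum their squared moduli. The sketch below is an illustration under stated assumptions: the coefficients are real, the high-pass coefficients follow the convention $b_k = (-1)^k a_{1-k}$ of (7.5.2), and the names `upsampled_product`, `high_pass`, and `mu0` are hypothetical.

```python
from itertools import product
import math


def upsampled_product(filters):
    """Coefficients c_k of m_{w(1)}(z) m_{w(2)}(z^2) ... m_{w(n)}(z^(2^(n-1))) as a dict."""
    coeffs = {0: 1.0}
    for level, filt in enumerate(filters):
        step = 2 ** level
        new = {}
        for k, ck in coeffs.items():
            for j, fj in filt.items():
                new[k + step * j] = new.get(k + step * j, 0.0) + ck * fj
        coeffs = new
    return coeffs


def high_pass(a):
    """b_k = (-1)^k a_{1-k} for real coefficients a given as {index: value}."""
    return {1 - j: (-1) ** (1 - j) * aj for j, aj in a.items()}


def mu0(omega, a, b):
    """mu_0 of the dyadic interval labelled by omega = (w(1), ..., w(n))."""
    n = len(omega)
    c = upsampled_product([b if w else a for w in omega])
    return 2 ** n * sum(abs(v) ** 2 for k, v in c.items() if k % 2 ** n == 0)


s3 = math.sqrt(3.0)
daub = {0: (1 + s3) / 8, 1: (3 + s3) / 8, 2: (3 - s3) / 8, 3: (1 - s3) / 8}
haar = {0: 0.5, 1: 0.5}

for w in product((0, 1), repeat=2):
    print(w, mu0(w, haar, high_pass(haar)), mu0(w, daub, high_pass(daub)))
```

For Haar every dyadic interval of length $2^{-n}$ receives mass $2^{-n}$, while for the four-tap filter the masses are unevenly distributed but still add up to 1 at each level.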


7.7 Representations of the algebra of the canonical 
anticommutation relations (CARs) 


The purpose of the present section is twofold: (1) to illustrate the non-abelian framework
for infinite-product constructions; and (2) to point out how a class of measures
that are known as determinantal measures (see [LySt03]) arises from a construction
in operator algebra theory, namely Powers–Størmer’s construction of the quasi-free
states on the C*-algebra of the canonical anticommutation relations, the CAR-algebra;
see [PoSt70].

The construction for both the abelian and the non-abelian setup begins with a
fixed operator $A$ in a Hilbert space, such that $A$ is assumed selfadjoint with spectrum
in the closed unit interval [0, 1]. In the abelian setting, the measure $\mu_A$ is defined on
cylinder sets $(i_1, \dots, i_n)$ in $\Omega$, as the determinant of the submatrix obtained from $A$
by using the subset of an orthonormal basis corresponding to the selection $i_1, \dots, i_n$.
As it turns out, $\mu_A$ is the restriction of a state $\rho_A$ on the CAR-algebra (a quasi-free
state, in the terminology of [PoSt70]), where $\rho_A$ is defined as the Hilbert norm of
fermion particles. This is a multistate computed from $i_1, \dots, i_n$ and computed in a
multiple-particle Hilbert space which is constructed directly from $A$.

In both cases, the abelian and the non-abelian, the extension from the cylinders 
to the full measure space, or the full C*-algebra, is performed with a variant of the 
Kolmogorov extension principle from Section 2.5 above. 

We begin with some terminology and definitions. 


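Definitions 7.7.1. Let $M_2$ denote the algebra of all $2 \times 2$ complex matrices, and set

$$\mathfrak{A}_n = \underbrace{M_2 \otimes \cdots \otimes M_2}_{n \text{ times}} \cong M_{2^n}. \tag{7.7.1}$$

Since $\mathfrak{A}_{n+1} = M_2(\mathfrak{A}_n)$, the mapping $T \mapsto \begin{pmatrix} T & 0 \\ 0 & T \end{pmatrix}$ defines a natural embedding of
$\mathfrak{A}_n$ into $\mathfrak{A}_{n+1}$, and we get

$$\mathfrak{A}_1 \subset \mathfrak{A}_2 \subset \cdots \subset \mathfrak{A}_n \subset \mathfrak{A}_{n+1} \subset \cdots \tag{7.7.2}$$

as a non-commutative analogue of (2.4.5) above.
Let $i, j \in \{0, 1\}$, let $e_{i,j}$ be the $2 \times 2$ matrix $e_{i,j}(k,l) = \delta_{i,k}\delta_{j,l}$, and set

$$e^{(n)}_{i_1, \dots, i_n;\, j_1, \dots, j_n} = e_{i_1,j_1} \otimes e_{i_2,j_2} \otimes \cdots \otimes e_{i_n,j_n}. \tag{7.7.3}$$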

For $\omega, \xi \in \Omega(n)$ set

$$E^{(n)}_{\omega,\xi} = S_\omega S_\xi^*, \tag{7.7.4}$$

where $S_\omega$ is defined from a representation of $\mathcal{O}_2$ as in (7.6.8). As noted in [BrJo02b],
the assignment

$$E^{(n)}_{\omega,\xi} \mapsto e^{(n)}_{\omega,\xi} \tag{7.7.5}$$

identifies a subalgebra (up to C*-algebraic isomorphism) of $\mathcal{O}_2$ with the infinite
tensor product

$$\mathfrak{A} = \bigotimes_{1}^{\infty} M_2 = \operatorname*{ind\,lim}_{n}\, \mathfrak{A}_n. \tag{7.7.6}$$

The C*-algebra $\mathfrak{A}$ in (7.7.6) is the inductive limit of the system (7.7.2) of matrix algebras,
and $\mathfrak{A}$ is also called the C*-algebra of the canonical anticommutation relations
(CARs) for reasons which are spelled out in [PoSt70].


The purpose of this section is to demonstrate that a family of measures which
were studied independently in [PoSt70] and in [LySt03] for very different reasons
may be viewed under the heading of this book, i.e., measures defined from functions
$\Omega \to$ (matrices), or inductive limits of systems $\Omega(n) \to$ (matrices), for $n \in \mathbb{N}$.

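Let $\mathcal{H} = \ell^2(\mathbb{N})$ be the familiar Hilbert space of $\ell^2$-sequences, and let $(e_j)_{j\in\mathbb{N}}$
be the standard ONB in $\mathcal{H}$, i.e., defined by $e_j(k) := \delta_{j,k}$, $j, k \in \mathbb{N}$. Let $A\colon \mathcal{H} \to \mathcal{H}$
be a linear operator such that

$$0 \le \langle h \mid A h \rangle \le \|h\|^2, \qquad h \in \mathcal{H}. \tag{7.7.7}$$

The matrix representation for $A$ will be defined relative to the ONB $(e_j)$, i.e., by

$$A(i,j) := \langle e_i \mid A e_j \rangle, \qquad i, j \in \mathbb{N}. \tag{7.7.8}$$

Let $\Omega = \{0,1\}^{\mathbb{N}}$, and define the matrix function $W$ on $\Omega$ by

$$W(\omega)(i,j) = \omega_i\, \delta_{i,j} + (-1)^{\omega_i} \bigl(\delta_{i,j} - A(i,j)\bigr). \tag{7.7.9}$$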

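For the C*-algebra $\mathfrak{A}$ we shall need the following representation from [PoSt70]:
there is an antilinear mapping $a\colon \mathcal{H} \to \mathfrak{A}$ such that

$$\begin{aligned}
a(h)\, a(k)^* + a(k)^*\, a(h) &= \langle h \mid k \rangle\, I, \\
a(h)\, a(k) + a(k)\, a(h) &= 0
\end{aligned}
\qquad \text{for all } h, k \in \mathcal{H}. \tag{7.7.10}$$

A state $\rho$ on $\mathfrak{A}$ is a linear functional $\rho\colon \mathfrak{A} \to \mathbb{C}$ such that $\rho(I) = 1$, and $\rho(T^*T) \ge 0$
for all $T \in \mathfrak{A}$.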

Let $A$ be a fixed operator as specified in (7.7.7). It is known [PoSt70] that there
is then a unique state $\rho = \rho_A$ on $\mathfrak{A}$ such that

$$\rho_A\bigl(a(e_{i_n})^* \cdots a(e_{i_1})^*\, a(e_{j_1}) \cdots a(e_{j_m})\bigr) = \delta_{n,m} \det\bigl(A(i_r, j_s)\bigr)_{1 \le r, s \le n}. \tag{7.7.11}$$

When the CARs (7.7.10) are used, we get the formulas (7.7.3) which are used in
the embeddings (7.7.2). As a result, we get the following formulas for the matrix
elements in the algebras in (7.7.2), with the superscript referring to the $n$’th tensor
slot (see [PoSt70]):

$$\begin{aligned}
e^{(n)}_{0,0} &= a(e_n)\, a(e_n)^*, & e^{(n)}_{0,1} &= a(e_n)\, V_{n-1}, \\
e^{(n)}_{1,0} &= a(e_n)^*\, V_{n-1}, & e^{(n)}_{1,1} &= a(e_n)^*\, a(e_n),
\end{aligned} \tag{7.7.12}$$

where $V_0 = I$ and

$$V_n = \prod_{k=1}^{n} \bigl(I - 2\, a(e_k)^* a(e_k)\bigr).$$

Using the diagonal matrices from (7.7.12), it is now clear that $C(\Omega)$ is naturally
embedded in $\mathfrak{A}$, and therefore in $\mathcal{O}_2$ as well; see (7.7.3)–(7.7.5).

Theorem 7.7.2. Let $A$ be a given operator satisfying (7.7.7), and let $W = W_A$ be
the corresponding matrix function (7.7.9) defined on $\Omega$. If $n \in \mathbb{N}$, and $\omega \in \Omega(n)$,
then $W(\omega)$ is an $n \times n$ complex matrix, and we set

$$P^{(n)}(\omega) = \det W(\omega). \tag{7.7.13}$$

Then the family $(P^{(n)})_{n\in\mathbb{N}}$ is Kolmogorov consistent, see Lemma 2.5.1, and therefore
extends to a Borel measure $P_A$ on $\Omega$. This measure is uniquely determined by
(7.7.13), and is the restriction to $C(\Omega)$ of the C*-state $\rho_A$ of (7.7.11). Moreover the
state is given uniquely by the two formulas

$$\rho_A\bigl(a(e_{i_1})\, a(e_{i_1})^* \cdots a(e_{i_r})\, a(e_{i_r})^*\bigr) = \det\bigl(I_n - P_{E_0} A P_{E_0}\bigr) \tag{7.7.14}$$

and

$$\rho_A\bigl(a(e_{j_1})^*\, a(e_{j_1}) \cdots a(e_{j_s})^*\, a(e_{j_s})\bigr) = \det\bigl(I_n - P_{E_1} (I - A) P_{E_1}\bigr). \tag{7.7.15}$$
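
The Kolmogorov consistency asserted in Theorem 7.7.2 can be observed on a small example. The sketch below reads the matrix function $W$ as follows: row $i$ of $W(\omega)$ is row $i$ of $A$ when $\omega_i = 1$, and row $i$ of $I - A$ when $\omega_i = 0$. Both this reading and the particular $3 \times 3$ matrix (symmetric, with eigenvalues between 0 and 1) are assumptions made for the illustration; exact rational arithmetic is used so that the identities can be tested without rounding. The names `det`, `w_matrix`, and `cylinder_probability` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small n)."""
    if not m:
        return Fraction(1)
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def w_matrix(a, omega):
    """Row i is row i of A if omega_i = 1, and row i of I - A if omega_i = 0."""
    n = len(omega)
    return [[a[i][j] if omega[i] else (1 if i == j else 0) - a[i][j]
             for j in range(n)] for i in range(n)]


def cylinder_probability(a, omega):
    """P^(n)(omega) = det W(omega), using the top-left n x n corner of A."""
    return det(w_matrix(a, omega))


H, Q = Fraction(1, 2), Fraction(1, 4)
A = [[H, Q, 0], [Q, H, Q], [0, Q, H]]       # example operator with 0 <= A <= I (assumption)

for omega in product((0, 1), repeat=3):
    print(omega, cylinder_probability(A, omega))
```

Summing the printed values over the last coordinate reproduces the two-coordinate probabilities, which is the consistency needed for the extension to a measure on $\Omega$.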


Proof. We know that the state $\rho_A$ on $\mathfrak{A}$ is well defined, and determined uniquely by
(7.7.11). Let $C(\Omega)$ be embedded in $\mathfrak{A}$ as described, and let $\rho_A|_{C(\Omega)}$ be the restriction.
Clearly the restriction is positive, and so determines a Borel measure on $\Omega$ by Riesz’s
theorem.

We claim that this measure is the same as the one coming from the family
$(P^{(n)})_{n\in\mathbb{N}}$ in (7.7.13); see also (7.7.9).

Let $n \in \mathbb{N}$, and $\omega \in \Omega(n)$. Set $E_0 = \omega^{-1}(0)$ and $E_1 = \omega^{-1}(1)$. Then $E_0 \cup E_1 =
\{1, \dots, n\}$.

An application of (7.7.11) and (7.7.10) yields the following: Let

$$E_0 = \{i_1, \dots, i_r\}, \qquad E_1 = \{j_1, \dots, j_s\},$$

and let $P_{E_0}$, resp. $P_{E_1}$, denote the corresponding orthogonal projections. Then the
two formulas (7.7.14)–(7.7.15) hold. Moreover, an inspection shows that (7.7.14)–(7.7.15)
determine the state uniquely. In view of the identification of $\Omega$ as the matrix
diagonals in (7.7.12), the two formulas (7.7.14) and (7.7.15), taken together, prove
that the restriction of $\rho_A$ satisfies the desired conclusion (7.7.13). □


Exercises 


Organization. This chapter is longer than the preceding one, and it has more exercises.
They are organized as follows: The first seven exercises serve to link the
operator theory from inside the text with matrix algorithms. As already stressed in
Chapter 1, wavelet bases exist abstractly in an ambient Hilbert space $\mathcal{H}$ (which could
be $L^2(\mathbb{R}^d)$), but for computations, Hilbert spaces as such are not very useful: we
need to make a connection from these function spaces to some concrete sequence
spaces, i.e., $\ell^2$-spaces $V$, and to efficient matrix representations of the objects and
transformations in those spaces. The basic idea of a multiresolution wavelet basis is
nothing more nor less than an efficient way to make that connection.

A wavelet transform is an operator which relates the sequence space $V$ to a
subspace $V_0$ of the ambient Hilbert space $\mathcal{H}$. With such a choice the subdivision
operators of wavelet geometry are turned into slanted matrices, which we saw are
(computationally) fast.

Exercises 7.4-7.7 are a guided tour of the matrix computations connecting the
discrete wavelet transform with “real” wavelet computations; see the material in
Section 7.6 and Chapters 5 and 8. In a simple case we see that the distribution of
expansion coefficients for a given wavelet packet is determined by the measures
from (7.6.12) and Theorem 7.6.7. Recall that our starting point is a discrete wavelet
transform (7.6.1). As outlined in Lemma 7.6.1, it takes the form of a system of operators
$S_i$ in the Hilbert space $L^2(\mathbb{T})$, or equivalently in $\ell^2$, while in contrast a wavelet
basis refers to functions in $L^2(\mathbb{R})$. In computations, not all Hilbert spaces are equal.

The second block of exercises deals primarily with tensor products of Hilbert 
space, and the associated geometry for operators and for matrices. One of these, Ex- 
ercise 7.8, about image processing, serves to link two ideas, slanted matrices and 
tensor products: For images, i.e., 2D, or other problems in higher dimensions, a pop- 
ular approach is to build the models simply by taking tensor products of well chosen 
1D wavelet algorithms. 

The last one, Exercise 7.12, is about representations of the C*-algebra $\mathcal{O}_N$ for
$N$ fixed. It stresses that in the present context the representations of $\mathcal{O}_N$ are more
important than $\mathcal{O}_N$ itself.


Notational conventions for slanted matrices. The slanted matrices used in this 
chapter, and later, are infinite-by-infinite, in fact doubly infinite, i.e., with rows and
columns extending to infinity in both positive and negative directions. Specifically, both rows and
columns are indexed by the integers Z. That is, the counting of rows and columns 
starts in the negative segment of Z, going up to 0 in the middle, and then continuing 
the counting with 1,2,..., with three dots indicating recursion. Rows are counted 
from top to bottom, and columns from left to right: So, in each case, the row index 
increases as we move from top to bottom in a matrix display; and the column index 
increases from left to right, consistent with the counting of columns. The matrices 
have uses in computation, where their slanted feature is significant. The slanting of 
these matrices reflects the subdivision (from scaling) built into the algorithm. 

In computations, we rely on matrix multiplication but choose our index convention
to be consistent with the associated operator formalism, i.e., with multiplication
of the corresponding operators in Hilbert space. An operator and an orthonormal basis
(ONB) will always be specified; typically the ONB is chosen to be the standard
canonical basis in $\ell^2(\mathbb{Z})$. But we will have occasion to use more general bases.

Our index convention means that the row/column position (0, 0), i.e., zero row
and zero column, is in the center position of each of the slanted matrix displays. We
have used shading to illustrate this point: The matrix entries in the center cross, with
crossing at (0, 0), are highlighted, meaning that the entries in both row 0 and column
0 are highlighted with a light shading. This is to help the reader see how matrix
multiplication is implemented; and it further serves to call attention to the slant in
each of the slanted matrices, and to visualize how slanting changes as matrices are
multiplied, or as the scaling number changes from one matrix to another.


7.1. Let $\mathcal{H}$ be a complex Hilbert space with inner product $\langle\, \cdot \mid \cdot \,\rangle$, and let $A$ be an
index set. Two systems of vectors in $\mathcal{H}$, $(e_n)_{n\in A}$ and $(\tilde{e}_n)_{n\in A}$, are said to form a dual
basis system, also called a bi-orthogonal basis in $\mathcal{H}$, if the following three conditions
are satisfied:

(i) $\langle \tilde{e}_n \mid e_m \rangle = \delta_{n,m}$ for all $n, m \in A$,
(ii) $\langle e_n \mid f \rangle = 0\ \ \forall n \in A \implies f = 0$ in $\mathcal{H}$,
(iii) $\langle \tilde{e}_n \mid f \rangle = 0\ \ \forall n \in A \implies f = 0$ in $\mathcal{H}$.

(a) Prove the following assertion: If $((e_n), (\tilde{e}_n))$ is a dual basis system, then
every vector $f \in \mathcal{H}$ has two representations,

$$f = \sum_{n\in A} \langle \tilde{e}_n \mid f \rangle\, e_n = \sum_{n\in A} \langle e_n \mid f \rangle\, \tilde{e}_n.$$

(b) Make precise the notion of convergence used in (a), and supply a detailed
epsilon-delta argument.

(c) Prove that if $(e_n)_{n\in A}$ is a system of vectors in $\mathcal{H}$, then $(e_n)_{n\in A}$ is an orthonormal
basis (ONB) in $\mathcal{H}$ if and only if the pair $((e_n)_{n\in A}, (e_n)_{n\in A})$ is a dual basis
system, i.e., if and only if we get a dual basis system by taking $\tilde{e}_n = e_n$ for all $n \in A$.

Definition. Let $(e_n)_{n\in A}$ be a system of vectors in $\mathcal{H}$. If there are two constants $c_1$,
$c_2$, $0 < c_1 \le c_2 < \infty$, such that

$$c_1 \|f\|^2 \le \sum_{n\in A} |\langle e_n \mid f \rangle|^2 \le c_2 \|f\|^2 \quad \text{for all } f \in \mathcal{H},$$

then we say that $(e_n)_{n\in A}$ is a frame basis (or just a frame) for the Hilbert space, and
$c_1$ and $c_2$ are called frame bounds.

(d) Suppose $(e_n)_{n\in A}$ is a frame. Then show that there is a system $(\tilde{e}_n)_{n\in A}$ such
that $((e_n)_{n\in A}, (\tilde{e}_n)_{n\in A})$ is a dual basis system.


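(e) If $(e_n)_{n\in A}$ is a frame for $\mathcal{H}$, show that the operator $T\colon \mathcal{H} \to \ell^2(A)$ defined
by $Tf = (\langle e_n \mid f \rangle)_{n\in A}$ for $f \in \mathcal{H}$ is bounded.

(f) Show that the adjoint operator $T^*\colon \ell^2(A) \to \mathcal{H}$ is given by the formula

$$T^*\bigl((x_n)_{n\in A}\bigr) = \sum_{n\in A} x_n e_n.$$

(g) Show that $T^*T\colon \mathcal{H} \to \mathcal{H}$ is invertible, and that

$$T^*T f = \sum_{n\in A} \langle e_n \mid f \rangle\, e_n \quad \text{for all } f \in \mathcal{H}.$$

(h) Setting $\tilde{e}_n := (T^*T)^{-1} e_n$ for $n \in A$, verify that the two reproduction formulas

$$f = \sum_{n\in A} \langle \tilde{e}_n \mid f \rangle\, e_n = \sum_{n\in A} \langle e_n \mid f \rangle\, \tilde{e}_n$$

hold for all $f \in \mathcal{H}$.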
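
Parts (e)–(h) can be tried out on a small finite frame. The sketch below uses three unit vectors in the plane at angles 120 degrees apart; this choice is an assumption made for illustration, and since the three vectors are redundant, the example illustrates only the frame operator $T^*T$ and the reproduction formulas in (h). The names `frame_operator`, `invert_2x2`, `dual_frame`, and `reconstruct` are hypothetical.

```python
import math


def frame_operator(frame):
    """2x2 matrix of T^*T f = sum_n <e_n | f> e_n for real vectors in the plane."""
    return [[sum(e[i] * e[j] for e in frame) for j in range(2)] for i in range(2)]


def invert_2x2(m):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def dual_frame(frame):
    """The vectors (T^*T)^{-1} e_n, as in part (h)."""
    inv = invert_2x2(frame_operator(frame))
    return [[inv[i][0] * e[0] + inv[i][1] * e[1] for i in range(2)] for e in frame]


def reconstruct(synthesis, analysis, f):
    """sum_n <analysis_n | f> synthesis_n for real vectors."""
    out = [0.0, 0.0]
    for s, a in zip(synthesis, analysis):
        coeff = a[0] * f[0] + a[1] * f[1]
        out[0] += coeff * s[0]
        out[1] += coeff * s[1]
    return out


angles = [math.pi / 2 + 2 * math.pi * n / 3 for n in range(3)]
frame = [(math.cos(t), math.sin(t)) for t in angles]
dual = dual_frame(frame)
f = [0.3, -1.2]
print(frame_operator(frame))
print(reconstruct(frame, dual, f), reconstruct(dual, frame, f))
```

For this frame $T^*T$ is $3/2$ times the identity, so both frame bounds equal $3/2$, and both reproduction formulas return $f$ up to rounding.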


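7.2. Let $\mathcal{H}$ and $\mathcal{K}$ be two complex Hilbert spaces, and let $T\colon \mathcal{H} \to \mathcal{K}$ be a bounded
linear operator. Let $(e_n)_{n\in A}$ and $(f_m)_{m\in B}$ be ONBs for the respective Hilbert spaces
$\mathcal{H}$ and $\mathcal{K}$. Hence we have the two representations

$$h = \sum_{n\in A} x_n e_n \quad \text{with } x_n = \langle e_n \mid h \rangle \text{ for } h \in \mathcal{H},$$

$$Th = \sum_{m\in B} y_m f_m \quad \text{with } y_m = \langle f_m \mid Th \rangle.$$

A matrix $(T(n,k))_{n\in B,\, k\in A}$ is said to represent the operator $T$ in the two bases if

$$y_n = \sum_{k\in A} T(n,k)\, x_k \quad \text{for } n \in B.$$

Show that when the two ONBs are given, then the matrix representation is
unique, and

$$T(n,k) = \langle f_n \mid T e_k \rangle.$$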


7.3. Generalize the result in Exercise 7.2 for $T\colon \mathcal{H} \to \mathcal{K}$ where $((e_n), (\tilde{e}_n))$ is a dual
basis system in $\mathcal{H}$, and $((f_m), (\tilde{f}_m))$ is a dual basis system in $\mathcal{K}$. Specifically, show
that then the formula $T(n,k) := \langle \tilde{f}_n \mid T e_k \rangle$ provides a matrix representation for $T$
relative to the two basis systems.


7.4. Consider the Hilbert space $\mathcal{H} = L^2(\mathbb{T})$ ($\cong \ell^2(\mathbb{Z})$), and let $e_k(z) = z^k$, $k \in \mathbb{Z}$,
be the usual Fourier basis.
Show that $(e_k)_{k\in\mathbb{Z}}$ is an ONB for $\mathcal{H}$.


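7.5. Let $m \in L^\infty(\mathbb{T})$ and consider the operator

$$(Sf)(z) = \sqrt{2}\, m(z)\, f(z^2), \qquad f \in \mathcal{H},\ z \in \mathbb{T},$$

and its adjoint operator

$$(S^* f)(z) = \frac{1}{\sqrt{2}} \Bigl( \overline{m(\sqrt{z})}\, f(\sqrt{z}) + \overline{m(-\sqrt{z})}\, f(-\sqrt{z}) \Bigr), \qquad f \in \mathcal{H},\ z \in \mathbb{T}.$$

Let $m(z) = \sum_{k\in\mathbb{Z}} a_k e_k(z)$ be the Fourier series for $m$.
(a) Now show that the matrix representations for the two operators $S$ and $S^*$
relative to $(e_k)_{k\in\mathbb{Z}}$ are as follows:

$$S(n,k) = \sqrt{2}\, a_{n-2k} \quad \text{and} \quad S^*(n,k) = \sqrt{2}\, \bar{a}_{k-2n} \quad \text{for } n, k \in \mathbb{Z}.$$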


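(b) Show that the two matrices in (a) have the following “slanted form”:

[Matrix display (7E.1): $S = \sqrt{2}\,(\cdots)$, a slanted matrix with the row $n = 0$ and the column $k = 0$ marked.]

and

[Matrix display (7E.2): $F := S^* = \sqrt{2}\,(\cdots)$, a slanted matrix with the row $n = 0$ and the column $k = 0$ marked.]

7.6. In the case of only two non-zero terms, say $a_0$ and $a_1$, the matrix $F$ from (7E.2)
takes the form

$$F = \left(\begin{array}{*{18}{c}}
* & * & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & * & * & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & * & * & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & * & * & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & * & * & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & * & * & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & * & * & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & * & * & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & * & *
\end{array}\right) \tag{7E.3}$$

(the middle row displayed is $n = 0$, and $*$ marks a non-zero entry), and hence the term “slanted Toeplitz” matrix.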

(a) Identifying the $\infty \times \infty$ matrix with the corresponding transformation in $\ell^2(\mathbb{Z})$
relative to the standard basis in $\ell^2(\mathbb{Z})$, find the structure of the invariant subspaces
for $F$. Make an infinite list of finite-dimensional subspaces which are invariant under
$F$.

(b) Answer the same questions as those listed in part (a), but now in the case
of only four non-zero terms (the four-tap case); say $a_0$, $a_1$, $a_2$, and $a_3$ non-zero, but
$a_k = 0$ if $k \in \mathbb{Z} \setminus \{0, 1, 2, 3\}$.

(c) In the four-tap case from part (b), find some finite-dimensional subspaces in
$\ell^2(\mathbb{Z})$ which are $F$-invariant and minimally $F$-invariant.


7.7. Suppose the matrix F has the form from (7E.3) in Exercise 7.6 corresponding 
to the two-tap case. Then show that F? has the following shape: 


Hint and discussion: From the theory inside the chapter, the reader will notice
that there are two numbers which determine the shape of the pair of matrices $S$ and
$F$ which occur in Exercises 7.5 to 7.7: First there is the number of taps $T$ (two-tap,
four-tap, six-tap, etc.), and secondly there is the scaling number. The scaling number
is denoted $N$, and for the dyadic wavelets, $N = 2$. Each operator, alias matrix, $S$
and $F$ has two lives: first it is represented as an operator in a function space, and
secondly we have outlined how it is represented as a matrix, albeit an infinite-by-infinite
matrix. If $S$ is represented as an operator acting on functions (see Exercise 7.5,
$N = 2$), it is composition with $z^N$ followed by multiplication with a function of $z$.
In each incarnation, we may talk about an adjoint $F = S^*$: In the operator world,
the adjoint refers to the inner product of the Hilbert space of functions on the circle
$\mathbb{T}$, and in the matrix picture, the adjoint is the usual matrix adjoint, i.e., there the
adjoint matrix arises simply as the complex conjugate of the transpose. We use the same
notation in either case, namely $F = S^*$, with the $*$ denoting adjoint. The amount
of slanting (i.e., shifting of the terms from one column in $S$ to the next) is equal to
the number $N$. The matrix transpose explains why the direction of the slanting
changes from formula (7E.1) for $S$ to (7E.2) for $F$ in Exercise 7.5. The direction
of the slanting is significant for computation, as it turns out. As already noted, the
number of taps $T$ is even, 2, 4, 6, etc., and it is simply the length of a minimal band in
a column of $S$, or equivalently a row in $F$. All the terms outside a band are zero, and
minimal means “the shortest” such band. In the present discussion, we will arrange
the bands so the index runs from 0 to $T - 1$, so four-tap corresponds to a minimal
band $\{a_0, a_1, a_2, a_3\}$. Using the function representation, it is easy to see that if $S$ has
tap number $T$ and scale number $N$, then $S^2$ has tap number $T + N(T - 1)$, and scale
number $N^2$. For the present discussion (Exercise 7.7), $N = T = 2$. Hence, in this
case, both the tap number and the scale number for $S^2$ are 4.
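
The statement about tap and scale numbers can be checked on finite sections of the doubly infinite matrices. The sketch below builds rows of $F$ from the rule $F(n,k) = \sqrt{2}\,\bar a_{k-2n}$ of Exercise 7.5, multiplies two finite sections chosen large enough that the displayed rows of $F^2$ are exact, and lists where the non-zero entries sit. The two-tap coefficients $a_0 = a_1 = 1/2$ and the names `slanted_entry`, `finite_section`, `matmul`, and `support` are assumptions of this illustration.

```python
import math


def slanted_entry(a, n, k):
    """F(n, k) = sqrt(2) * conj(a_{k-2n}) for a = [a_0, ..., a_{T-1}], zero elsewhere."""
    idx = k - 2 * n
    return math.sqrt(2) * complex(a[idx]).conjugate() if 0 <= idx < len(a) else 0.0


def finite_section(a, rows, cols):
    """Rows n in `rows` and columns k in `cols` of the doubly infinite matrix F."""
    return [[slanted_entry(a, n, k) for k in cols] for n in rows]


def matmul(x, y):
    """Product of two rectangular matrices given as lists of rows."""
    return [[sum(x[i][l] * y[l][j] for l in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def support(row, cols):
    """Column indices at which the row is non-zero."""
    return [k for k, v in zip(cols, row) if abs(v) > 1e-12]


a = [0.5, 0.5]                               # two-tap example coefficients (assumption)
rows, mid, cols = [-1, 0, 1], range(-3, 4), range(-4, 8)
f_squared = matmul(finite_section(a, rows, mid), finite_section(a, mid, cols))
for n, row in zip(rows, f_squared):
    print(n, support(row, cols))
```

Each displayed row of $F^2$ has a band of four non-zero entries, and the band moves four columns from one row to the next, in agreement with tap number 4 and scale number 4.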


Images. The next multipart exercise is about images. It involves two features: 
First, it is a recursive matrix algorithm, and secondly it is a tensor-product con- 
struction. The environment of images is 2D, and the tensor product of two sepa- 
rate 1D algorithms thus serves to create a new 2D algorithm. In fact this simple 
tensor-product construction is the easiest way of producing the type of 2D wavelet 
algorithms needed for images. 


7.8. (a) Search the internet for your favorite visual sequence created with a digital 
camera, and illustrating image processing for a digital portrait-photo; for example of 
Lena [WWW5]. Look closely at subdivisions of pictures, and compare the cascading 
successions. Identify how the pictures change from one subdivision step to the next 
in the cascade. 

(b) Having understood the pictures on the web, then compare with the algorith- 
mic resolution steps in Figures 7.9 and 7.10 in this chapter (pp. 126-126). 

A portrait (such as Lena’s) consists of numbers arranged into matrix forms: only 
one matrix is needed for recording of grayscales, but several matrices are used for 
digitizing color images. An exposure with a digital camera imprints numbers for 
each pixel. Pixels form checkerboard arrangements which in turn correspond to a 
chosen resolution. However a digital camera creates both scales of resolution and 
intermediate differences; and everything is digitized. 

So a picture $L$ which is chosen and fixed at the outset will be represented by a
vector in the subspace $V_0$ from Figure 7.10. Since subdivision in the planar picture
is made with horizontal and vertical lines, the first algorithmic analysis step applied
to $L$ (in vector form) will split it up into four pieces, i.e., $N = 4$.

(c) In the image of Lena (L) on the Web, identify the corresponding four subpic- 
tures. They are usually arranged inside a square, with the NW scaled “quarter-box” 
showing the same image L but in a slightly more blurred resolution. Identify how 
the blurring corresponds to “local averages.” The other three scaled pictures repre- 
sent differences in grayscale (for black/white) in three directions, horizontal, vertical, 
and diagonal. 

In the Hilbert-space representation, each of the four subsquares of a picture is a
component in the corresponding four subspaces $V_1$ (the coarser resolution), and the
three intermediate spaces $W_1^{(1)}$, $W_1^{(2)}$, and $W_1^{(3)}$ from Figure 7.10.

(d) Work out that the three W-subspaces are registers for intermediate differences 
in grayscales. 

(e) Finally, from the pictures on the web, identify how this is the start of a 
cascading process. Show how the process continues on the part of $L$ in the space $V_1$. As 
you follow the pictures on the web, note that the process is repeated, and at each step 
the part of $L$ in the NW subsquare is further subdivided. Check that this algorithm 
corresponds to the pyramid which is illustrated with Figure 7.9, for $N = 4$. 
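
The first subdivision step in parts (c)-(e) can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it uses the Haar filter with plain (unnormalized) averages, and the function names `haar_step_2d` and `haar_inverse_2d` are hypothetical, not taken from the cited web pages. Each 2 x 2 block of pixels is replaced by one local average and three differences; which of the two axis-aligned differences is called "horizontal" or "vertical" depends on the convention used.

```python
def haar_step_2d(img):
    """One Haar analysis step on a grayscale matrix with even side lengths.

    Returns four quarter-size matrices: local averages, and differences
    across columns, across rows, and along the diagonal.
    """
    avg, d_cols, d_rows, d_diag = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, c_row, r_row, g_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            p, q = img[r][c], img[r][c + 1]
            s, t = img[r + 1][c], img[r + 1][c + 1]
            a_row.append((p + q + s + t) / 4)  # blurred "NW quarter" value
            c_row.append((p - q + s - t) / 4)  # left column minus right column
            r_row.append((p + q - s - t) / 4)  # top row minus bottom row
            g_row.append((p - q - s + t) / 4)  # diagonal difference
        avg.append(a_row)
        d_cols.append(c_row)
        d_rows.append(r_row)
        d_diag.append(g_row)
    return avg, d_cols, d_rows, d_diag


def haar_inverse_2d(avg, d_cols, d_rows, d_diag):
    """Reassemble the picture from its four subpictures (exact inverse)."""
    img = [[0.0] * (2 * len(avg[0])) for _ in range(2 * len(avg))]
    for r in range(len(avg)):
        for c in range(len(avg[0])):
            a, dc, dr, dg = avg[r][c], d_cols[r][c], d_rows[r][c], d_diag[r][c]
            img[2 * r][2 * c] = a + dc + dr + dg
            img[2 * r][2 * c + 1] = a - dc + dr - dg
            img[2 * r + 1][2 * c] = a + dc - dr - dg
            img[2 * r + 1][2 * c + 1] = a - dc - dr + dg
    return img
```

Applying `haar_step_2d` again to the matrix of averages gives the next level of the cascade, which is the repetition described in part (e).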


Note: The following citations throw light on the use of wavelet algorithms/filters 
in 2D image processing: [Song05] and [StHS+99]. Dr. Myung-Sin Song has created a 
sample of images which are decomposed this way using the wavelet system of Haar, 
plus the analogous image decompositions using Coiflets and Symlets. The processing 
is posted at [Song05a] and the images themselves at [Song05b]. 


Tensor products. The next three exercises, 7.9, 7.10, and 7.11, highlight tensor 
products of Hilbert space, both finite and infinite tensor products. Each is composed 
in a multipart format, so that it is easier for the student to work through the concepts 
step by step. The exercises reinforce fundamental concepts already used in Chap- 
ter 7 above; and at the same time, they motivate and introduce some essential facts 
(such as Schmidt’s decomposition in its infinite-dimensional variant) for tensor prod- 
ucts that we will need in the remaining chapters. The tensor-product idea for Hilbert 
space is central in our present basis constructions and in our analysis in general in 
that it lets us use Hilbert-space geometry to reduce or to factor “complicated” data 
into its simpler forms or factors. The reader is encouraged to compare Schmidt’s 
decomposition (Exercise 7.11(e)) with the closely related formulas from Exercises 2.7 
and 9.8 for the Karhunen–Loève decomposition. 


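7.9. Let $\mathcal{H}_i$, $i = 1, 2$, be Hilbert spaces, and consider tensors $f_1 \otimes f_2$, $f_i \in \mathcal{H}_i$. Set 

$$\langle f_1 \otimes f_2 \mid g_1 \otimes g_2 \rangle := \langle f_1 \mid g_1 \rangle \langle f_2 \mid g_2 \rangle.$$

(a) Show that this formula extends by sesquilinearity to define an inner product, 
also denoted $\langle \cdot \mid \cdot \rangle$, on the space of all finite sums 

$$\sum_k c_k\, f_1^{(k)} \otimes f_2^{(k)}, \qquad c_k \in \mathbb{C},\ f_i^{(k)} \in \mathcal{H}_i.$$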


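(b) Spell out the completion step which turns the tensor product $\mathcal{H}_1 \otimes \mathcal{H}_2$ into a 
Hilbert space. 

(c) Let $A$ and $B$ be index sets, and consider two indexed families of vectors 
$f_\alpha \in \mathcal{H}_1$, $g_\beta \in \mathcal{H}_2$, $\alpha \in A$, $\beta \in B$, such that $(f_\alpha)_{\alpha \in A}$ is an ONB in $\mathcal{H}_1$ and $(g_\beta)_{\beta \in B}$ 
is an ONB in $\mathcal{H}_2$. Then show that $(f_\alpha \otimes g_\beta)_{(\alpha,\beta) \in A \times B}$ is an ONB in $\mathcal{H}_1 \otimes \mathcal{H}_2$. 

(d) Conclude from (c) that the tensor-product formula 

$$\ell^2(A) \otimes \ell^2(B) = \ell^2(A \times B)$$

holds for the respective $\ell^2$-sequence Hilbert spaces. 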

(e) Let $(X_i, \mathcal{B}_i, \mu_i)$, $i = 1, 2$, be two measure spaces, and let $\mathcal{B}$ be the $\sigma$-algebra 
generated by all sets in $X_1 \times X_2$ of the form $E_1 \times E_2$ where $E_i \in \mathcal{B}_i$. Verify that the 
product measure $\mu := \mu_1 \times \mu_2$ is determined uniquely on $\mathcal{B}$ by completion and the 
ansatz 

$$\mu(E_1 \times E_2) = \mu_1(E_1)\,\mu_2(E_2).$$

(f) Let $(X_i, \mathcal{B}_i, \mu_i)$, $i = 1, 2$, and $\mu$ be as in (e). Then prove that there is a 
“natural” Hilbert-space (isometric) isomorphism 

$$L^2(X_1, \mu_1) \otimes L^2(X_2, \mu_2) \cong L^2(X_1 \times X_2, \mu).$$


(g) For $i \in \mathbb{N}$, consider compact spaces $X_i$ with Borel probability measures $\mu_i$. 
Using Kolmogorov’s construction (Lemma 2.5.1), carry out the determination of the 
infinite-product measure 

$$\mu = \prod_{i \in \mathbb{N}} \mu_i \quad \text{on } X := \prod_{i \in \mathbb{N}} X_i.$$

Hint: 

$$\mu\Big(\prod_{i=1}^{n} E_i \times \prod_{k=n+1}^{\infty} X_k\Big) = \prod_{i=1}^{n} \mu_i(E_i), \qquad E_i \in \mathcal{B}_i.$$

(h) Extend the tensor-product formula from (f) above to infinite tensor products. 
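
For orientation only, here is a finite-dimensional sanity check of the defining formula of Exercise 7.9 and of part (c). It is a sketch under stated assumptions, not a solution: the two small orthonormal bases are chosen arbitrarily, the tensor $f \otimes g$ is realized as the Kronecker product of coordinate vectors, and the helper names `inner`, `kron`, and `is_orthonormal` are hypothetical.

```python
import math


def inner(u, v):
    """Inner product, conjugate linear in the first variable."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def kron(u, v):
    """Coordinates of the elementary tensor u (x) v."""
    return [a * b for a in u for b in v]


def is_orthonormal(vectors, tol=1e-12):
    """Check that the Gram matrix of the given vectors is the identity."""
    for i, u in enumerate(vectors):
        for j, v in enumerate(vectors):
            expected = 1.0 if i == j else 0.0
            if abs(inner(u, v) - expected) > tol:
                return False
    return True


s = 1 / math.sqrt(2)
basis_1 = [[s, s], [s, -s]]                                # an ONB in C^2
basis_2 = [[1, 0, 0], [0, s, s * 1j], [0, s, -s * 1j]]     # an ONB in C^3
tensor_basis = [kron(f, g) for f in basis_1 for g in basis_2]
```

With these definitions, `is_orthonormal(tensor_basis)` holds, and for any coordinate vectors the product rule `inner(kron(f1, f2), kron(g1, g2)) == inner(f1, g1) * inner(f2, g2)` can be checked up to rounding.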

7.10. Infinite tensor products 

Let $\mathcal{H}$ be a Hilbert space, and let $(h_n)_{n \in \mathbb{N}}$ be an infinite sequence of vectors such 
that $\|h_n\| = 1$, and 

• $\lim_{n \to \infty} \|h_n - h\| = 0$ for some $h \in \mathcal{H}$, 

• $\sum_{n=1}^{\infty} \|h_n - h_{n+1}\| < \infty$, 

and 

• $\sum_{n=1}^{\infty} \{\arccos |\langle h_n \mid h \rangle|\}^2 = \infty$. 

Consider infinite tensors $\bigotimes_1^\infty f_k = f_1 \otimes f_2 \otimes \cdots$ where all but finitely many of 
the factors $f_i$ agree with the unit vectors $h_i$, i.e., from a finite step onwards, $f_k = h_k$ 
holds. On such tensors, define the inner product 

$$\Big\langle \bigotimes_1^\infty f_k \Bigm| \bigotimes_1^\infty g_k \Big\rangle = \prod_{k=1}^{\infty} \langle f_k \mid g_k \rangle.$$

(a) Use a completion argument to show that this ansatz yields a Hilbert space 
which depends on the chosen system $(h_k)$. 

(b) Give a formula for comparing two Hilbert spaces $\mathcal{H}^{\otimes}_{\infty}(h_k)$ and $\mathcal{H}^{\otimes}_{\infty}(h'_k)$ 
constructed this way from sequences $(h_k)$ and $(h'_k)$ of unit vectors, both satisfying 
the three conditions stated before (a). 


7.11. Let $\mathcal{H}_i$, $i = 1, 2$, be Hilbert spaces, and let $C$ be a conjugation in $\mathcal{H}_1$, i.e., $C$ is 
conjugate linear and satisfies $C^2 = I$. Let $\psi \in \mathcal{H}_1 \otimes \mathcal{H}_2$, $\|\psi\| = 1$, be given. 

(a) Show that there is a unique bounded linear operator $K := K_\psi \colon \mathcal{H}_2 \to \mathcal{H}_1$ 
such that 

$$\langle \psi \mid C f_1 \otimes f_2 \rangle_{\mathcal{H}_1 \otimes \mathcal{H}_2} = \langle f_1 \mid K f_2 \rangle_{\mathcal{H}_1} \quad \text{for all } f_i \in \mathcal{H}_i,\ i = 1, 2.$$

(b) Show that the operator $T := K^*K \colon \mathcal{H}_2 \to \mathcal{H}_2$ is of trace class with trace one, 
i.e., that for every ONB $(b_\alpha)_{\alpha \in A}$ in $\mathcal{H}_2$, we have 

$$\sum_{\alpha \in A} \langle b_\alpha \mid T b_\alpha \rangle = \sum_{\alpha \in A} \|K b_\alpha\|^2 = 1.$$

In particular, justify this convergence. 

(c) Use the spectral theorem to show that there is an ONB $(b_\alpha)_{\alpha \in A}$ in $\mathcal{H}_2$ and 
numbers $\lambda_\alpha \ge 0$ such that 

$$K^*K = \sum_{\alpha \in A} \lambda_\alpha\, |b_\alpha\rangle \langle b_\alpha|.$$

(d) Let $K = W (K^*K)^{1/2}$ be the polar decomposition of the operator $K$ with $W$ 
denoting the partial-isometry factor. Then prove that 

$$K b_\alpha = \sqrt{\lambda_\alpha}\, W b_\alpha \quad \text{for all } \alpha \in A.$$

(e) Schmidt’s decomposition (also called Schmidt’s theorem or Schmidt’s 
factorization). Let $\psi$ be given, $\psi \in \mathcal{H}_1 \otimes \mathcal{H}_2$ as in (a), and let $C$, $W$, $\lambda_\alpha$, and $b_\alpha$ be as 
specified in (c)–(d). Then prove that 

$$\psi = \sum_{\alpha \in A} \sqrt{\lambda_\alpha}\; C W b_\alpha \otimes b_\alpha.$$

This last exercise shows that Kolmogorov’s extension principle from Lemma 
2.5.1 generalizes naturally to the non-commutative setting that we encountered in 
our use of the $C^*$-algebra $\mathcal{O}_N$. The relations which define $\mathcal{O}_N$ are essential for 
wavelet algorithms, and for our representation of corresponding processes from subband 
filtering. The formulation below is adapted from [BrJKW00]. 


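7.12. First some notation: Let $N \in \{2, 3, \dots\}$ and let $\mathbb{Z}_N$ be a set of $N$ elements. Let 
$\Omega_{\mathrm{fin}}$ be the set of finite sequences $(i_1, \dots, i_m)$ where $i_k \in \mathbb{Z}_N$ and $m \in \{1, 2, \dots\}$. 
We also include the empty sequence $\emptyset$ in $\Omega_{\mathrm{fin}}$, and denote elements in $\Omega_{\mathrm{fin}}$ by 
$I, J, \dots$. If $I = (i_1, \dots, i_m) \in \Omega_{\mathrm{fin}}$ and $i \in \mathbb{Z}_N$, we let $Ii$ denote the element 
$(i_1, \dots, i_m, i)$ in $\Omega_{\mathrm{fin}}$, and $s_I = s_{i_1} s_{i_2} \cdots s_{i_m} \in \mathcal{O}_N$ and $s_I^* = s_{i_m}^* \cdots s_{i_2}^* s_{i_1}^* \in \mathcal{O}_N$. 
In particular, $s_\emptyset = 1 = s_\emptyset^*$. 

Let $N \in \{2, 3, \dots\}$. Show that there is a canonical one-to-one correspondence 
between the following objects. 

(a) States $\rho$ on $\mathcal{O}_N$. 
(b) Functions $C \colon \Omega_{\mathrm{fin}} \times \Omega_{\mathrm{fin}} \to \mathbb{C}$ with the following properties: 
  (i) $C(\emptyset, \emptyset) = 1$, 
  (ii) for any function $\xi \colon \Omega_{\mathrm{fin}} \to \mathbb{C}$ with finite support we have 
  $\sum_{I, J \in \Omega_{\mathrm{fin}}} \overline{\xi(I)}\, C(I, J)\, \xi(J) \ge 0$, 
  (iii) $\sum_{i \in \mathbb{Z}_N} C(Ii, Ji) = C(I, J)$ for all $I, J \in \Omega_{\mathrm{fin}}$. 
(c) Unitary equivalence classes of objects $(\mathcal{K}, v_0, V_1, \dots, V_N)$ where 
  (i) $\mathcal{K}$ is a Hilbert space, 
  (ii) $v_0$ is a unit vector in $\mathcal{K}$, 
  (iii) $V_1, \dots, V_N \in \mathcal{B}(\mathcal{K})$, 
  (iv) the linear span of vectors of the form $V_I^* v_0$, where $I \in \Omega_{\mathrm{fin}}$, is dense in $\mathcal{K}$, 
  (v) $\sum_{i \in \mathbb{Z}_N} V_i V_i^* = 1_{\mathcal{K}}$. 
The correspondence is given by 

(d) $\rho(s_I s_J^*) = C(I, J) = \langle V_I^* v_0 \mid V_J^* v_0 \rangle$. 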


References and remarks 


Images and higher dimension 


The early 1990s brought rapid advances on two fronts: wavelets in mathematics 
and image processing in engineering. The engineering applications include both the 
projects going into compressing and converting fingerprints into digital files [Bri95], 
and also the creation of processes used in chips in digital cameras. 

On the mathematical side, we note the use of tensor products in transferring 
1D wavelet algorithms into higher-dimensional analogues, as suggested in 
Daubechies’ classic book [Dau92]. In fact our discussion of this in the 
general context of Hilbert space (see Figure 7.10, p. 126, and Exercise 7.8 above) is 
close to the ideas in [Dau92, Chapter 10]. 

Daubechies suggests the use of wavelets for image processing on page 313 in 
[Dau92]. She first uses the tensor-product construction in passing from problems 
in 1D to 2D, and then gives a geometric interpretation of the recursive splitting of 
resolution subspaces of $L^2(\mathbb{R}^2)$ as an iterated subdivision of squares. With this 
construction, elements of subsquares simultaneously play two separate roles: 
(1) vectors in cascades of subspaces of an ambient Hilbert space, and 
(2) the digital components of an image represented in grayscale and processed by a 
2D wavelet algorithm. She then reproduces a visual rendition in [Dau92, Fig. 10.3, 
page 316], and she credits this image-cascade to M. Barlaud. 

Other pioneering mathematical presentations involving 2D wavelet algorithms 
include [GrMa92] and [LaRe91]. Several results from [LaRe91] were later extended 
and appeared in the next generation of papers such as [LaLS96] and [LaLS98]. (We 
thank W.M. Lawton for explaining this to us.) 


Representations of the Cuntz algebras 


The idea of using the representations of the Cuntz algebras $\mathcal{O}_N$ in the study of 
iterated function systems (IFS) and more general fractals may originate with the paper 
[JoPe96] by Steen Pedersen and the author. 

The figures in this chapter (and elsewhere) were created by Brian Treadway with 
Mathematica, and the Mathematica program used for Figure 7.5 (pp. 120-121) has 
been included below. 


(* trigonometric parametrization of the filter coefficients *)
pa0[\[Theta]_] := (1/4) - (1/4) (Cos[\[Theta]])
    - (1/4) Sin[\[Theta]]
pa1[\[Theta]_] := -(1/4) (-1 + Sin[\[Theta]])
    + (1/4) Cos[\[Theta]]
pa2[\[Theta]_] := (1/2) + (1/2) Sin[\[Theta]]
pa3[\[Theta]_] := -(1/2) Cos[\[Theta]]
pa4[\[Theta]_] := (1/4) + (1/4) (Cos[\[Theta]])
    - (1/4) Sin[\[Theta]]
pa5[\[Theta]_] := (1/4) (-1 + Sin[\[Theta]])
    + (1/4) Cos[\[Theta]]
a0[\[Theta]_] := pa0[\[Theta] + (\[Pi]/2)]
    - pa5[\[Theta] + (\[Pi]/2)]
a1[\[Theta]_] := pa1[\[Theta] + (\[Pi]/2)]
    + pa4[\[Theta] + (\[Pi]/2)]
a2[\[Theta]_] := pa2[\[Theta] + (\[Pi]/2)]
    - pa3[\[Theta] + (\[Pi]/2)]
a3[\[Theta]_] := pa3[\[Theta] + (\[Pi]/2)]
    + pa2[\[Theta] + (\[Pi]/2)]
a4[\[Theta]_] := pa4[\[Theta] + (\[Pi]/2)]
    - pa1[\[Theta] + (\[Pi]/2)]
a5[\[Theta]_] := pa5[\[Theta] + (\[Pi]/2)]
    + pa0[\[Theta] + (\[Pi]/2)]

(* one step of the cascade iteration on a table of sample values *)
loctwont[\[Theta]_] := N[Transpose[{{a2[\[Theta]],
    a0[\[Theta]]}, {a3[\[Theta]], a1[\[Theta]]}}]]
cascadestep[phitable_, \[Theta]_] := Flatten[Partition[
    Flatten[{0, phitable, 0}], 2, 1] . loctwont[\[Theta]]]
correlatewavelet[phistart_, itercount_, \[Theta]_] :=
    Flatten[Transpose[Partition[PadRight[
    Nest[cascadestep[#, \[Theta]] &, phistart,
    itercount], 6 (2^itercount)], (2^itercount)]]]

(* filter vector and its sign-alternated companion *)
avec[\[Theta]_] := {a0[\[Theta]],
    a1[\[Theta]], a2[\[Theta]], a3[\[Theta]]}
signavec[\[Theta]_] := {-a0[\[Theta]],
    a1[\[Theta]], -a2[\[Theta]], a3[\[Theta]]}
ABstep[wltsc_, \[Theta]_] := {ListConvolve[avec[\[Theta]],
    wltsc, 1, 0], ListCorrelate[signavec[\[Theta]],
    wltsc, -1, 0]}

unscramble[wltsc_] := Flatten[Transpose[Partition[
    wltsc, 6]]]
rescramble[wltsc_] := Flatten[Transpose[Partition[
    Partition[Partition[PadRight[wltsc, 2 Length[wltsc]],
    2], 3], Length[wltsc] / 6], {3, 2, 4, 1}]]

startit = 7

philevelcount = 5

currenttheta = -5 \[Pi]/6

firstphilevel[phistart_, itercount_, \[Theta]_] :=
    Map[ABstep[#, \[Theta]] &, correlatewavelet[
    phistart, itercount, \[Theta]], {-2}]

nextphilevel[wlttree_, \[Theta]_] := Map[ABstep[#,
    \[Theta]] &, Map[rescramble, wlttree, {-2}], {-2}]

(* the 32 point plots *)
Table[Show[Graphics[Map[Point, Transpose[{Table[i (2^(
    -(startit + philevelcount))), {i, 1, 3 (2^(startit
    + philevelcount))}], Flatten[Map[unscramble, Nest[
    nextphilevel[#, currenttheta] &, firstphilevel[{1},
    startit, currenttheta], philevelcount - 1], {-2}],
    philevelcount - 1][[f]]}]], {AspectRatio \[Rule]
    Automatic, Axes \[Rule] {True, False}, Ticks \[Rule]
    None, Frame \[Rule] True, ImageSize \[Rule] 74,
    FrameTicks \[Rule] {{0, 1, 2, 3}, {-2, -1, 0, 1, 2},
    None, None}, PlotRange \[Rule] {{-0.125, 3.125},
    {-2.5625, 2.5625}}}]], {f, 1, 32}];


We offer the following minimal list of relevant references (books and papers): 
covering multiresolutions, [Jor03, BaCM02, BaMM99, BrJo02b]; multiwavelets, 
[BaMe99]; and wavelet packets, [CoWi93, Wic93, Wic94]. In our discussion of 
families of filters, high-pass and low-pass, in Remark 7.4.4 above, we rely on an 
operator-theoretic formulation which is only implicit in the formulation we offer 
here. The operator-theoretic approach uses certain representations of $C^*$-algebras, 
the Cuntz algebras $\mathcal{O}_N$, $N = 2, 3, \dots$; see [Cun77]. However, this 
representation-theoretic viewpoint is stressed more, and made explicit, in our research papers 
[BrJo02a, BrJo02b, Jor03, Jor05, Jor04b]. 

Combining formulas (7.7.9) with (7.7.14)–(7.7.15), we see that the measures 
from Theorem 7.7.2 are simply restrictions of certain quantum-mechanical states 
to the canonical maximal abelian subalgebra $\mathcal{D}$ in the CAR-algebra $\mathfrak{A}$, i.e., the 
$C^*$-algebra $\mathfrak{A}$ on the canonical anticommutation relations, also called the fermion 
algebra. The subalgebra $\mathcal{D}$ in $\mathfrak{A}$ is simply the $C^*$-subalgebra in the CAR-algebra which 
is generated by the diagonal matrix elements from the list (7.7.12), i.e., by the matrix 
elements $e_{ii}^{(n)}$ when $n$ runs over $\mathbb{N}_0$ and $i = 0, 1$. 

While the formulas (7.7.14)–(7.7.15) appear to define $\rho_A$ only on $\mathcal{D}$, i.e., 
appear to only define a measure, the paper [PoSt70] makes it clear that the measure 
is a restriction of a unique state $\rho_A$ on the CAR-algebra $\mathfrak{A}$. 
These states on the CAR-algebra are called quasi-free states in [PoSt70]. In 
fact [PoSt70] initiated an important study (in operator-algebra theory) of a notion 
of equivalence (called quasi-equivalence) of these states, generalizing the more 
familiar notion of equivalence of measures: two measures are said to be equivalent 
if they are mutually absolutely continuous. The results of Powers–Størmer [PoSt70] 
are motivated in turn by seminal work of Kakutani [Kak48] on equivalence of 
infinite-product measures. 

The definition of the $C^*$-algebra $\mathfrak{A}$ on the CARs simply dictates the known 
relations that arise in models of infinite systems of quantum particles called fermions. 
In fact, the CARs axiomatize Pauli’s exclusion principle for the fermion particles, or 
fermion gases. At the same time, the CAR-algebra $\mathfrak{A}$ models infinite lattice systems 
in quantum statistical mechanics; see [Rue69]. The framework applies independently 
of the lattice dimension, and it allows the development of a mathematical 
formulation of equilibrium states that does not rely on a chosen way of taking limits of finite 
subsystems. 

More generally, the quasi-free states on the CAR-algebra are important in physics. 
They are widely used in the study of Fermi gases, and they have played a role in sta- 
tistical mechanics and in axiomatic quantum field theory over the years; see, e.g., 
[Hal83] and [Rue69]. 

Independent of the long history of the study of states on the CAR-algebra, the 
measures from the conclusion of Theorem 7.7.2 have emerged in statistics and com- 
binatorial probability theory under the name “determinantal measures;” see, e.g., the 
paper [LySt03] by Lyons and Steif, and the papers cited there. Lyons et al. and 
Diaconis et al. [DiFr99] used the determinantal measures in giving a new independent 
proof of Szegő’s limit theorem, and in further extending the scope of Szegő’s 
theorem to cover applications to infinite particle models in physics. 




The determinantal measures have especially attractive ergodicity and monotonic- 
ity properties; see [LySt03]. In fact, they have been used in other research, e.g., in 
new proofs of rigorous results on Fredholm determinants in the theory of orthogonal 
functions, and in the representation theory of infinite-dimensional unitary groups; 
see the references in [DiFr99]. 

We saw in this chapter that there are specific non-abelian operator algebras whose 
representations are of significance for wavelets, fractals, and for the kind of subband 
filters which are used in both signal processing and in the study of wavelets. 
Specifically, we studied for each $N$ the Cuntz algebra $\mathcal{O}_N$ and two of its distinguished 
subalgebras. In the special case of $N = 2$, one of the important subalgebras of $\mathcal{O}_2$ 
is known as the CAR-algebra. Here we are more interested in particular classes of 
representations of these $C^*$-algebras than in the $C^*$-algebras themselves. The names 
for the various $C^*$-algebras are current lingo in the theory of operator algebras and 
in physics. It turns out that the $C^*$-algebras we encountered in this chapter have 
been used for a long time in mathematical physics (especially in statistical 
mechanics [Rue69, Rue89]), where their representations play a role in the study of infinite 
particle systems. Ruelle’s paper [Rue89] was the one which introduced the transfer 
operator, the operator which now goes under the name of the 
Perron–Frobenius–Ruelle operator. As for the $C^*$-algebras themselves, see, e.g., [Dav96] for a friendly 
and current treatment. 

Unfortunately technical jargon differs from one field to the other. To reduce this 
linguistic confusion, we have collected a “translation guide,” i.e., a separate glossary 
(pp. xvii-xxv above). Readers not already familiar with operator algebras might find 
Davidson’s little book [Dav96] to be an agreeable introduction. 

However, we wish to stress that the representations of the Cuntz algebras $\mathcal{O}_N$ 
play a role in a rich variety of applications, most of which are outside the scope 
of this book. Even within electrical engineering itself, signal processing is not the 
only place where the Cuntz relations show up. They play a crucial role in systems 
theory as well. Indeed, it has been shown by Joe Ball and his coauthors (see, e.g., 
[BaCV05] and [BaVi05]) that multidimensional linear input/output systems and their 
scattering theory may be couched elegantly in terms of representations of the Cuntz 
algebras $\mathcal{O}_N$. 

The central idea in this chapter may be encapsulated in Figure 7.7 (p. 124). While 
this diagram is from engineering (signal processing), we have stressed that various 
equivalent forms of it have come up quite independently in mathematics, and for 
entirely different reasons. We believe that the interdisciplinary connections stressed 
here have great potential, but it appears that in the mathematics community they 
have been largely overlooked so far. 

For mathematics students looking for a friendly survey of filter banks, i.e., the 
subdivision method for frequency subband filtering (in an engineering treatment), 
we recommend Vaidyanathan’s book [Vai93] (a classic!); and for an excellent review 
of optimal subband and transform coders, see [VaAk01]. Many uses of filter banks 
and the associated algorithms in wavelets (pure and applied) are covered in [Mal98], 
[VeKo95], and [StNg96]. 

The next chapter continues the same theme, but from a more geometric viewpoint 
than is traditional. 


8 


Pyramids and operators 


Mathematics is an experimental science, and definitions do not come first, 
but later on. —Oliver Heaviside 


PREREQUISITES: Operators and their adjoints; Hilbert space; product measure; 
set-theoretic partitions; multiindices; a curiosity about some applications outside of 
mathematics. 


Prelude 


In Chapters 4 and 7, we stressed that the crucial feature of localization is shared by 
a number of basis constructions, most notably by those of wavelets and of certain 
classes of fractals. This includes basis constructions in Hilbert spaces built recur- 
sively on fractals and on state spaces in dynamics. The recursive approach to the 
more general basis constructions is a special case of a refined tool from probability 
which is based on martingales. (It should be contrasted to classical Fourier expan- 
sions, which are notoriously poorly localized.) 

The localization can be further refined with the use of pyramid algorithms (see, 
e.g., the multipart images in Figures 7.4 and 7.5, pp. 118-121), and we will follow 
up on this in the present chapter. What should emerge is that our constructions based 
on scale-similarity have an intrinsic recursive nature which immediately suggests 
numerical implementations of efficient and iterative matrix algorithms. 

We further saw that the use of iteration and scale similarity for bases in function 
spaces is intrinsic and closely tied to localization. This in turn is part of active and 
current research, and has been made precise in a number of brand new research 
papers which go far beyond the scope of our book. Suffice it to call attention to 
work by Stephane Jaffard (e.g., [Jaf05]) on oscillation spaces, including the so-called 
Besov spaces (also beyond the scope of our book). 

In connection with localization, we mentioned at the conclusion of Chapter 5 that 
there are further implications for approximation of functions. In particular we cited 
papers by Terence Tao (for wavelets) and R.S. Strichartz (for Hilbert spaces on 
fractals) which show that localized basis expansions in function spaces yield much better 
pointwise approximations than is possible for traditional Fourier bases; this contrasts 
again with classical Fourier expansions. In addition, Chapter 7 above emphasizes 
that localization is crucial for computation! 

In this chapter we shall take up an additional and separate issue related to local- 
ization, that of separation of variables. Our approach to this will be geometric, and 
as far as possible, it will be based on the notion of tensor products in Hilbert space, 
including infinite tensor products. Readers not familiar with Hilbert-space concepts 
are encouraged to first do the block of multipart exercises in Chapter 7 (starting 
with Exercise 7.9) which are entirely devoted to the kind of tensor products that are 
needed. 


8.1 Why pyramids 


In this chapter, we follow up on and generalize the ideas which form the basis for 
the pyramid algorithm (7.5.5) of dyadic wavelet packets. To see how this ties into 
more general models, based on branching, we begin with a closer examination of the 
unitary operator $U_2 \colon L^2(\mathbb{R}) \to L^2(\mathbb{R})$ given by 

$$U_2 f(t) := \sqrt{2}\, f(2t), \qquad f \in L^2(\mathbb{R}),\ t \in \mathbb{R}. \tag{8.1.1}$$

In particular, we show that there are two representations, $(P_i)$ and $(S_i)$, of the Cuntz 
algebra $\mathcal{O}_2$, see (7.6.1), such that 

$$U_2 = \sum_{i=0}^{1} P_i \otimes S_i^*. \tag{8.1.2}$$

We will also study more general pairs of representations $(P_i)$ and $(S_i)$ of the Cuntz 
algebra $\mathcal{O}_N$ such that the factorization 

$$U = \sum_{i=0}^{N-1} P_i \otimes S_i^* \tag{8.1.3}$$

has attractive spectral properties. 

Fig. 8.1. The fundamental dyadic pyramid. 


8.2 Dyadic wavelet packets 


The simplest instance of the framework in Section 2.1, especially (2.1.2), is the 
following: Let $X = \mathbb{N}_0 = \{0, 1, 2, \dots\}$, and set 

$$\tau_0(n) = 2n, \qquad \tau_1(n) = 2n + 1, \qquad n \in \mathbb{N}_0. \tag{8.2.1}$$

Defining 

$$\sigma(2n + i) := n, \qquad i = 0, 1,\ n \in \mathbb{N}_0, \tag{8.2.2}$$

we see that (2.1.2) is satisfied. Moreover the sequence $\varphi_0, \varphi_1, \varphi_2, \dots$ of wavelet 
packet functions from (7.5.5) in Proposition 7.5.2 is generated by the pyramid in 
Fig. 8.1. 

We say that the pyramid in Fig. 8.1 is singly generated. 


Definition 8.2.1. Let $X$, $\sigma$, $\tau_i$, $i = 0, \dots, N - 1$, be a branching system as defined in 
(2.1.2). We say that the system has $d$ generators if there is a subset $E \subset X$ satisfying 

$$\# E = d \tag{8.2.3}$$

and 

$$\bigcup_{n \in \mathbb{N}} \sigma^{-n}(E) = X, \tag{8.2.4}$$

where 

$$\sigma^{-n}(E) = \{ x \in X \mid \sigma^{n}(x) \in E \}. \tag{8.2.5}$$

We now introduce the following modification in Example 2.2.1 from Chapter 2. 
We set 

$$\tau_0(n) = 2n, \qquad \tau_1(n) = 2n + 3, \qquad n \in \mathbb{N}_0, \tag{8.2.6}$$

$$\sigma(2n + 3i) = n, \qquad n \in \mathbb{N}_0,\ i = 0, 1.$$

Setting $X = \mathbb{Z}$ (and letting $n$ range over $\mathbb{Z}$ in these formulas), we see that condition 
(2.1.2) is satisfied, so the system $(\mathbb{Z}, \sigma, \tau_i)$ 
is a dyadic branching system. Moreover the diagram in Fig. 8.2 and an 
induction show that the subset 

$$E := \{-3, -2, -1, 0\} \tag{8.2.7}$$

satisfies the condition in Definition 8.2.1. In contrast, (8.2.2) is singly generated. 

Another feature which separates the system (8.2.6), with step size 3, is that it 
contains a non-trivial two-cycle, i.e., $C = \{-2, -1\}$. Note that $\sigma(-2) = -1$, and 
$\sigma(-1) = -2$. In addition, it has two distinct one-cycles, i.e., $\sigma(-3) = -3$, and 
$\sigma(0) = 0$. 

Lemma 8.2.2. Let $N \in \mathbb{N}$, $N \ge 2$, and let $\sigma \colon X \to X$ be an $N$-to-$1$ mapping which 
is onto $X$. Pick branches of the inverse, 

$$\tau_i \colon X \to X, \qquad i = 0, 1, \dots, N - 1, \tag{8.2.8}$$

such that (2.1.2) holds. 
On $\ell^2(X)$, let $|x\rangle$, for $x \in X$, be the canonical basis vector, and set 

$$P_i |x\rangle := |\tau_i(x)\rangle, \qquad i = 0, 1, \dots, N - 1. \tag{8.2.9}$$

The adjoint operator $P_i^*$ is 

$$P_i^* |x\rangle = \chi_{\tau_i(X)}(x)\, |\sigma(x)\rangle, \qquad i = 0, 1, \dots, N - 1. \tag{8.2.10}$$

Then this system of operators satisfies the Cuntz relations 

$$P_i^* P_j = \delta_{i,j}\, 1_{\ell^2(X)}, \qquad \sum_{i=0}^{N-1} P_i P_i^* = 1_{\ell^2(X)}. \tag{8.2.11}$$

Proof. The formulas (8.2.11) follow directly by an application of the expressions 
(8.2.9) and (8.2.10) for the operators $P_i$ and their adjoints $P_i^*$. The computation of 
the adjoint operator to $P_i$ in (8.2.9) is straightforward. □ 
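
The relations in the lemma can be tested on basis vectors. The sketch below is an illustration under assumptions: it takes the dyadic system $X = \mathbb{N}_0$, $N = 2$, $\tau_i(n) = 2n + i$ from (8.2.1), and it stores a finitely supported vector of $\ell^2(X)$ as a dictionary from points to coefficients. The names `P`, `P_star`, `add`, and `check_cuntz` are hypothetical helpers.

```python
N = 2


def tau(i, x):
    return N * x + i


def sigma(x):
    return x // N


def P(i, vec):
    """P_i |x> = |tau_i(x)>, extended linearly."""
    return {tau(i, x): c for x, c in vec.items()}


def P_star(i, vec):
    """P_i^* |x> = |sigma(x)> if x lies in tau_i(X), and 0 otherwise."""
    return {sigma(x): c for x, c in vec.items() if x % N == i}


def add(u, v):
    """Sum of two finitely supported vectors, dropping zero coefficients."""
    out = dict(u)
    for x, c in v.items():
        out[x] = out.get(x, 0) + c
    return {x: c for x, c in out.items() if c != 0}


def check_cuntz(points):
    """Verify both relations on the basis vectors |x>, x in points."""
    for x in points:
        e = {x: 1}
        for i in range(N):
            for j in range(N):
                if P_star(i, P(j, e)) != (e if i == j else {}):
                    return False
        total = {}
        for i in range(N):
            total = add(total, P(i, P_star(i, e)))
        if total != e:
            return False
    return True
```

Since both relations are linear, checking them on basis vectors is enough for finitely supported vectors; `check_cuntz(range(100))` returns `True` for this system.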


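Lemma 8.2.3. (a) Let $(S_i)_{i=0}^{N-1}$ be a representation of the Cuntz algebra $\mathcal{O}_N$ on the 
Hilbert space $\mathcal{K}$, and set 

$$U = \sum_{i=0}^{N-1} P_i \otimes S_i^*. \tag{8.2.12}$$

Then $U \colon \ell^2(X) \otimes \mathcal{K} \to \ell^2(X) \otimes \mathcal{K}$ is a unitary operator. 

(b) Conversely, if $(S_i)$ is a system of operators in $\mathcal{K}$ such that $U$ in (8.2.12) is 
unitary, then the operators $S_i$ satisfy the Cuntz relations 

$$S_i^* S_j = \delta_{i,j}\, 1_{\mathcal{K}}, \qquad \sum_{i=0}^{N-1} S_i S_i^* = 1_{\mathcal{K}}. \tag{8.2.13}$$

Proof. We will prove (a), but leave (b) to the reader. 
First, 

$$\begin{aligned}
U U^* &= \sum_{i} \sum_{j} P_i P_j^* \otimes S_i^* S_j \\
&\underset{\text{by (8.2.13)}}{=} \sum_{i} \sum_{j} P_i P_j^* \otimes \delta_{i,j}\, 1_{\mathcal{K}} \\
&\underset{\text{by (8.2.11)}}{=} 1_{\ell^2(X)} \otimes 1_{\mathcal{K}},
\end{aligned}$$

and secondly, 

$$\begin{aligned}
U^* U &= \sum_{i} \sum_{j} P_i^* P_j \otimes S_i S_j^* \\
&\underset{\text{by (8.2.11)}}{=} \sum_{i} \sum_{j} \delta_{i,j}\, 1_{\ell^2(X)} \otimes S_i S_j^* \\
&\underset{\text{by (8.2.13)}}{=} 1_{\ell^2(X)} \otimes 1_{\mathcal{K}}.
\end{aligned}$$
□ 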


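Lemma 8.2.4. Let $(\varphi_n)_{n \in \mathbb{N}_0}$ be the sequence in $L^2(\mathbb{R})$ from Proposition 7.5.2, and 
let $\mathcal{K}$ be the Hilbert space $L^2(\mathbb{T})$ with basis 

$$e_j(z) := z^j, \qquad z \in \mathbb{T},\ j \in \mathbb{Z}. \tag{8.2.14}$$

Setting 

$$W(|n\rangle \otimes e_j) := \varphi_n(t - j), \tag{8.2.15}$$

this assignment extends to define a co-isometry of $\ell^2(\mathbb{N}_0) \otimes \mathcal{K}$ into $L^2(\mathbb{R})$ with range 
equal to a dense subspace in $L^2(\mathbb{R})$. This co-isometry will also be denoted $W$. 

Proof. The conclusion (7.5.8) from Proposition 7.5.2 is the assertion that the 
double-indexed family 

$$\{ \varphi_n(\cdot - j) \mid n \in \mathbb{N}_0,\ j \in \mathbb{Z} \} \tag{8.2.16}$$

forms a tight frame (also called a Parseval frame) for the Hilbert space $L^2(\mathbb{R})$. It 
follows from this that $W$ in (8.2.15) maps onto a dense subspace in $L^2(\mathbb{R})$; and we 
need only check that it is co-isometric. 

For $f \in L^2(\mathbb{R})$, we have the following computation: 

$$\begin{aligned}
\|W^* f\|^2 &\underset{\text{by Parseval}}{=} \sum_{n \in \mathbb{N}_0} \sum_{j \in \mathbb{Z}} \big| \langle |n\rangle \otimes e_j \mid W^* f \rangle \big|^2 \\
&= \sum_{n \in \mathbb{N}_0} \sum_{j \in \mathbb{Z}} \big| \langle W(|n\rangle \otimes e_j) \mid f \rangle \big|^2 \\
&\underset{\text{by (8.2.15)}}{=} \sum_{n \in \mathbb{N}_0} \sum_{j \in \mathbb{Z}} \big| \langle \varphi_n(\cdot - j) \mid f \rangle \big|^2 \\
&\underset{\text{by (7.5.8)}}{=} \|f\|^2_{L^2(\mathbb{R})}.
\end{aligned}$$
□ 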

Remark 8.2.5. Let $W$ be the operator from (8.2.15) in the lemma. Then $W$ is a 
unitary isomorphism of $\ell^2(\mathbb{N}_0) \otimes L^2(\mathbb{T})$ onto $L^2(\mathbb{R})$ if and only if the family $\varphi_n(\cdot - j)$ 
in (8.2.16) forms an orthonormal basis in $L^2(\mathbb{R})$. 


In our next result, we show that the familiar dyadic scaling operator $U_2$ of (8.1.1) 
has a tensor representation with respect to the two representations of the Cuntz 
algebra $\mathcal{O}_2$ from above. 


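Theorem 8.2.6. Let $P_i$, $i = 0, 1$, be the representation of $\mathcal{O}_2$ which is described in 
Lemma 8.2.2 and (8.2.1)–(8.2.2), and let $S_i$, $i = 0, 1$, be one of the representations 
of $\mathcal{O}_2$ described in Lemma 7.6.1. Let 

$$W \colon \ell^2(\mathbb{N}_0) \otimes L^2(\mathbb{T}) \to L^2(\mathbb{R}) \tag{8.2.17}$$

be the co-isometry of Lemma 8.2.4. 
Then 

$$W \sum_{i=0}^{1} P_i \otimes S_i^* = U_2 W, \tag{8.2.18}$$

where $U_2$ is the dyadic scaling operator (8.1.1) in $L^2(\mathbb{R})$. 

Proof. First 

$$\begin{aligned}
W \sum_{i=0}^{1} P_i \otimes S_i^* \big( |n\rangle \otimes e_j \big) &= W \Big( \sum_{i=0}^{1} |2n + i\rangle \otimes S_i^* e_j \Big) \\
&= \sum_{i=0}^{1} \sum_{k \in \mathbb{Z}} \langle e_k \mid S_i^* e_j \rangle\, \varphi_{2n+i}(t - k) \\
&= \sum_{i=0}^{1} \sum_{k \in \mathbb{Z}} \sqrt{2}\, \overline{a^{(i)}_{j-2k}}\, \varphi_{2n+i}(t - k),
\end{aligned} \tag{8.2.19}$$

where 

$$a^{(0)}_k = a_k, \quad \text{and} \quad a^{(1)}_k = (-1)^k\, \bar{a}_{1-k} \quad \text{for } k \in \mathbb{Z}; \tag{8.2.20}$$

see also (7.5.1)–(7.5.2). Substituting this back into (8.2.19), we get 

$$W \Big( \sum_{i=0}^{1} P_i \otimes S_i^* \big( |n\rangle \otimes e_j \big) \Big) = \sqrt{2}\, \varphi_n(2t - j), \tag{8.2.21}$$

which is the desired conclusion (8.2.18). □ 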

Corollary 8.2.7. Let the two representations of $\mathcal{O}_2$ be as described in the theorem, 
and assume in addition that the functions in (8.2.16) form an orthonormal basis 
(ONB) in $L^2(\mathbb{R})$. 

Then the unitary scaling operator $U_2$ in $L^2(\mathbb{R})$ is unitarily equivalent with 
$\sum_{i=0}^{1} P_i \otimes S_i^*$. 


Proof. Immediate from the theorem, and Remark 8.2.5. □ 


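Proposition 8.2.8. (a) Let the two representations of $\mathcal{O}_2$ be as described in Theorem 
8.2.6, and assume the functions in (8.2.16) are orthonormal. 
Then 

$$\langle \varphi_m(\cdot - k) \mid U_2 \varphi_n(\cdot - j) \rangle = \sum_{i=0}^{1} \delta_{m,2n+i}\, \langle S_i e_k \mid e_j \rangle \qquad \text{for all } m, n \in \mathbb{N}_0. \tag{8.2.22}$$

(b) Let $a \in L^\infty(\mathbb{T})$ with Fourier series 

$$a(z) = \sum_{k \in \mathbb{Z}} a_k z^k, \qquad z \in \mathbb{T}, \tag{8.2.23}$$

and set 

$$a * \varphi_n := \sum_{k \in \mathbb{Z}} a_k\, \varphi_n(\cdot - k) \tag{8.2.24}$$

(and similarly for $b$). Then 

$$\langle a * \varphi_m \mid U_2\, b * \varphi_n \rangle = \sum_{i=0}^{1} \delta_{m,2n+i}\, \langle S_i a \mid b \rangle. \tag{8.2.25}$$

Proof. It follows from sesquilinearity that the second conclusion is implied by the 
first. To verify (8.2.22), we carry out a computation and use the results in Section 
7.5. 

Before starting the computation of the left-hand side in (8.2.22), write $m \in \mathbb{N}_0$ 
in the form $m = p + 2l$, $p \in \{0, 1\}$, and $l \in \mathbb{N}_0$. (This is the familiar Euclidean 
algorithm in its simplest form!) We now calculate the term for $m = p + 2l$; but since 
$p$ runs over $\{0, 1\}$, the result must be stated as a sum, as in (8.2.22). 

We have 

$$\begin{aligned}
\langle \varphi_m(\cdot - k) \mid U_2 \varphi_n(\cdot - j) \rangle
&= \int_{\mathbb{R}} \overline{\varphi_m(t - k)}\, \sqrt{2}\, \varphi_n(2t - j)\, dt \\
&\underset{\text{(by Plancherel)}}{=} \int_{\mathbb{R}} \overline{e_k(x)\, \hat\varphi_m(x)}\, \frac{1}{\sqrt{2}}\, \hat\varphi_n\Big(\frac{x}{2}\Big)\, e_j\Big(\frac{x}{2}\Big)\, dx \\
&= \sqrt{2} \int_{\mathbb{R}} \overline{e_k(2x)\, \hat\varphi_m(2x)}\, \hat\varphi_n(x)\, e_j(x)\, dx \\
&= \sqrt{2} \int_{\mathbb{R}} \overline{e_k(2x)\, m_p(x)\, \hat\varphi_l(x)}\, \hat\varphi_n(x)\, e_j(x)\, dx \\
&= \sqrt{2} \int_0^1 \overline{e_k(2x)\, m_p(x)} \sum_{r \in \mathbb{Z}} \overline{\hat\varphi_l(x + r)}\, \hat\varphi_n(x + r)\, e_j(x)\, dx \\
&\underset{\text{(by orthonormality)}}{=} \sqrt{2} \int_0^1 \overline{e_k(2x)\, m_p(x)}\, \delta_{l,n}\, e_j(x)\, dx \\
&= \delta_{l,n} \int_0^1 \overline{e_k(x)}\, \frac{1}{\sqrt{2}} \sum_{2y \equiv x \bmod 1} \overline{m_p(y)}\, e_j(y)\, dx \\
&= \delta_{l,n}\, \langle e_k \mid S_p^* e_j \rangle \\
&= \delta_{l,n}\, \langle S_p e_k \mid e_j \rangle = \sum_{i=0}^{1} \delta_{m,2n+i}\, \langle S_i e_k \mid e_j \rangle,
\end{aligned}$$

which is the desired result. Recall that 

$$\delta_{l,n} = \sum_{i=0}^{1} \delta_{m,2n+i}, \tag{8.2.26}$$

and only one term on the right-hand side in (8.2.26) is non-zero. □ 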

Corollary 8.2.9. ([CoWi93], [Wic93]) Let the two representations of $\mathcal{O}_2$ be as 
described in Theorem 8.2.6, and let $\varphi_0, \varphi_1, \dots$ be the functions generated by the 
algorithm (7.5.7). Consider a subset 

$$A \subset \mathbb{N}_0 \times \mathbb{N}_0. \tag{8.2.27}$$

Then the following two conditions are equivalent: 

(i) the segments 

$$[2^p n,\ 2^p (n + 1)) \subset \mathbb{N}_0, \qquad (p, n) \in A, \tag{8.2.28}$$

form a non-overlapping partition (or tiling) of $\mathbb{N}_0$; 
and 

(ii) 

$$\{ U_2^p\, \varphi_n(\cdot - j) \mid (p, n) \in A,\ j \in \mathbb{Z} \} \tag{8.2.29}$$

is an ONB in $L^2(\mathbb{R})$. 


Proof. We give a proof which relies on Theorem 8.2.6 above. Having the 
representation 

$$U_2 = \sum_{i=0}^{1} P_i \otimes S_i^* \tag{8.2.30}$$

makes it clear that 

$$U_2^p = \sum_{I} P_I \otimes S_I^* \tag{8.2.31}$$

(where the summation is over all multiindices $I$ of length $p$). 
Using elementary properties of tensor product in Hilbert space, we similarly get 

$$(U_2^p)^* = U_2^{-p} = \sum_{I} P_I^* \otimes S_I. \tag{8.2.32}$$

Recall, the formula for $P_i^*$ is given in (8.2.10). With the multiindex notation 

$$I = (i_1, i_2, \dots, i_p), \qquad i_j \in \{0, 1\},\ 1 \le j \le p,$$

it is then easy to verify the formula (8.2.32) for $(U_2^p)^*$. The reader only has to insert 
the expressions for the multiindexed operator monomials. Specifically, we have 

$$P_I = P_{i_1} \cdots P_{i_p}, \qquad S_I = S_{i_1} \cdots S_{i_p}, \qquad S_I^* = S_{i_p}^* \cdots S_{i_2}^* S_{i_1}^*. \tag{8.2.33}$$

We further refer to the formulas (7.6.18)–(7.6.19) for the explicit computation of 
the multioperators $S_I$, and adjoints $S_I^*$. 
With this, it is clear from Proposition 8.2.8 that 

$$\begin{aligned}
\langle \varphi_m(\cdot - k) \mid U_2^p \varphi_n(\cdot - j) \rangle &= \int_{\mathbb{R}} \overline{\varphi_m(t - k)}\, 2^{p/2}\, \varphi_n(2^p t - j)\, dt \\
&= \sum_{I} \delta_{m,\, 2^p n + i_1 + \cdots + i_p 2^{p-1}}\, \langle S_I e_k \mid e_j \rangle.
\end{aligned}$$

The two transformation rules follow immediately from this. We have 

$$U_2^p\, \varphi_n(\cdot - j) = \sum_{\substack{(i_1, \dots, i_p),\ m,\ k \\ 2^p n + i_1 + \cdots + i_p 2^{p-1} = m}} \langle S_I e_k \mid e_j \rangle\, \varphi_m(\cdot - k). \tag{8.2.34}$$

Moreover, when $A$ is chosen, we get the converse basis transformation: 

$$\varphi_m(\cdot - k) = \sum_{\substack{(i_1, \dots, i_p),\ n,\ j \\ 2^p n + i_1 + \cdots + i_p 2^{p-1} = m}} \langle e_j \mid S_I e_k \rangle\, U_2^p\, \varphi_n(\cdot - j). \tag{8.2.35}$$

The equivalence of (i) and (ii) is immediate from this. □ 


The next section gives more details on the choice of the sets $A$ which make the 
segments $[2^p n, 2^p (n + 1))$ tile $\mathbb{N}_0$. We will show that the measures from Section 7.6 
allow us to make “good” selections of sets $A \subset \mathbb{N}_0 \times \mathbb{N}_0$ and therefore “good” ONBs. 
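
Condition (i) of Corollary 8.2.9 is purely combinatorial, so it can be tested directly on an initial segment of $\mathbb{N}_0$. The sketch below is an illustration under assumptions: it only inspects the finite window $\{0, \dots, M - 1\}$, and the function name `tiles_initial_segment` is hypothetical.

```python
def tiles_initial_segment(A, M):
    """Check that the segments [2^p n, 2^p (n+1)), (p, n) in A,
    cover each of 0, ..., M-1 exactly once."""
    count = [0] * M
    for p, n in A:
        start = 2 ** p * n
        stop = min(2 ** p * (n + 1), M)
        for m in range(start, stop):   # empty when start >= M
            count[m] += 1
    return all(c == 1 for c in count)


# Two selections that tile {0, ..., 15}:
packets = {(0, n) for n in range(16)}                 # all segments of length 1
wavelet_like = {(0, 0)} | {(p, 1) for p in range(4)}  # [0,1), [1,2), [2,4), [4,8), [8,16)
```

Both `packets` and `wavelet_like` pass the test with `M = 16`, whereas `{(1, 0), (0, 1)}` fails already for `M = 2` because the segments $[0, 2)$ and $[1, 2)$ overlap.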


8.3 Measures and decompositions 


In this section we revisit the measures from Section 7.6. Indeed these measures were
introduced in [CoWi93] and [CoMW95] for the purpose of choosing between the
families of orthonormal bases (ONBs) which are listed in (8.2.28) from Corollary
8.2.9. The measures help in the use of entropy methods in the selection of the “best”
ONBs.

In (8.2.28) from Corollary 8.2.9, we consider segments in $\mathbb{N}_0$, and when two
segments $[2^p n, 2^p (n+1))$ and $[2^{p'} n', 2^{p'} (n'+1))$ are disjoint then the
corresponding basis functions $U^p \varphi_n(\cdot - j)$ and $U^{p'} \varphi_{n'}(\cdot - j')$ are
orthogonal, and vice versa. This follows from the two transformation rules
(8.2.34)–(8.2.35) which make it clear that a segment $[2^p n, 2^p (n+1))$ consists
precisely of the numbers $m \in \mathbb{N}_0$ which admit a dyadic representation

$m = 2^p n + i_1 + \cdots + i_p 2^{p-1} = 2^p n + x.$ (8.3.1)


Working in $\mathbb{R}/\mathbb{Z} \cong [0, 1)$, we see from (8.3.1) that the partitions from (i) in
Corollary 8.2.9 correspond to dyadic partitions of the unit interval. The correspondence
between (a) the “p-subintervals,” or segments of the integers $\mathbb{N}_0$, and (b) the dyadic
fractional intervals in $[0, 1)$ is

$[2^p n, 2^p (n+1)) \longleftrightarrow \left[ \frac{i_1}{2^p} + \cdots + \frac{i_p}{2},\ \frac{i_1}{2^p} + \cdots + \frac{i_p}{2} + \frac{1}{2^p} \right),$ (8.3.2)

where the integer $x$ in (8.3.1) is specified by the multiindex $I$, and the Euclidean
algorithm, i.e.,

$x = i_1 + i_2 2 + \cdots + i_p 2^{p-1}, \quad \text{with } i_j \in \{0, 1\}.$ (8.3.3)

In the sequel, we will use this 1-1 correspondence $m$ (natural number) $\leftrightarrow I = (i_1, \dots)$.
If $m$ is given, and $i_p$ is the last non-zero term in (8.3.3), we say that $p$ is the length
of $m$, or of $I$.

Comparing with (8.3.1), the length is the smallest $p$ for which there is a representation
(8.3.1) with $n = 0$. The number $p$ in (8.3.2) is this length, i.e., $p := \operatorname{length}(I)$.
As a result, $p$ is determined by $m$ in (8.3.2), and $2^{-p}$ is then the length of the
corresponding dyadic subinterval in $[0, 1)$ on the right-hand side in (8.3.2).

If $m$ is as in (8.3.1), then $2^{-p} m = n + i_1 2^{-p} + \cdots + i_p 2^{-1} = i_1 2^{-p} + \cdots + i_p 2^{-1}$
modulo $\mathbb{Z}$; and indeed $i_1 2^{-p} + \cdots + i_p 2^{-1}$ is the left endpoint in the subinterval on
the right-hand side in the formula (8.3.2).

Left endpoints to left endpoints, and the same for rights! So $2^p n$ should correspond
to $i_1 2^{-p} + \cdots + i_p 2^{-1}$, and $2^p (n+1)$ to $i_1 2^{-p} + \cdots + i_p 2^{-1} + 2^{-p}$. Now

$2^p \left( i_1 2^{-p} + \cdots + i_p 2^{-1} \right) = i_1 + \cdots + i_p 2^{p-1} = x,$
and
$2^p \left( i_1 2^{-p} + \cdots + i_p 2^{-1} + 2^{-p} \right) = x + 1.$

So when $I$ varies over the $p$-tuples $(i_1, \dots, i_p)$, then $x$ covers $[0, 2^p)$; and when $n$
is chosen as in (8.3.1), then the integers $m = 2^p n + x$ cover $[2^p n, 2^p (n+1))$.

Hence, a specific multiindex $I = (i_1, \dots, i_p)$ “selects” a unique subinterval in
$[0, 1)$ of length $2^{-p}$, and with endpoints which are dyadic rationals.
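
The correspondence in (8.3.1)–(8.3.3) can be traced with a few lines of Python. The sketch below is only an illustration; the helper names `digits`, `split`, and `dyadic_interval` are hypothetical and do not come from the text. Exact rational arithmetic is used so that the endpoints are dyadic rationals on the nose.

```python
from fractions import Fraction


def digits(x, p):
    """Return I = (i_1, ..., i_p) with x = i_1 + i_2*2 + ... + i_p*2**(p-1), as in (8.3.3)."""
    return tuple((x >> k) & 1 for k in range(p))


def split(m, p):
    """Write m = 2**p * n + x as in (8.3.1) and return (n, I)."""
    n, x = divmod(m, 2 ** p)
    return n, digits(x, p)


def dyadic_interval(I):
    """Endpoints of the subinterval of [0, 1) attached to I by (8.3.2)."""
    p = len(I)
    # i_1 is divided by 2**p, ..., i_p is divided by 2.
    left = sum(Fraction(i, 2 ** (p - k)) for k, i in enumerate(I))
    return left, left + Fraction(1, 2 ** p)
```

For instance, `split(13, 2)` gives `n = 3` and `I = (1, 0)`, and `dyadic_interval((1, 0))` is the interval from 1/4 to 1/2. Running `split` over a whole segment $[2^p n, 2^p (n+1))$ produces each dyadic subinterval of length $2^{-p}$ exactly once.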


Definition 8.3.1. If $(S_i)_{i=0}^{1}$ is a representation of $\mathcal{O}_2$ in $L^2(\mathbb{T})$, and
$S_I = S_{i_1} \cdots S_{i_p},$
then
$P(I) := S_I S_I^*$
is the orthogonal projection of $L^2(\mathbb{T})$ onto the subspace $S_I L^2(\mathbb{T})$.
We saw in Section 7.6 (Chapter 7 above) that the correspondence

$I \mapsto P(I)$ (8.3.4)

extends naturally to a projection-valued measure defined on the Borel subsets $\mathcal{B}$ of
$[0, 1)$; see also [Jor05] and [Jor04b] for additional details.

Specifically, this measure extension (also denoted $P(\cdot)$) will be $\sigma$-additive, and
it is orthogonal. This means that the identity

$P(B_1 \cap B_2) = P(B_1) P(B_2)$ (8.3.5)

holds for all $B_1, B_2 \in \mathcal{B}$.
As a result, every $h \in L^2(\mathbb{T})$, $\|h\|^2 = 1$, induces a probability measure $\mu_h$,
according to the formula

$\mu_h(B) := \|P(B) h\|^2, \quad \text{for } B \in \mathcal{B}.$ (8.3.6)

Proposition 8.3.2. Let $(S_i)_{i=0}^{1}$ be some representation of $\mathcal{O}_2$ on $L^2(\mathbb{T})$ which
corresponds to a quadrature-mirror filter $(m_0, m_1)$ as described in Lemma 7.6.1. Let
$p, n \in \mathbb{N}_0$, and $j \in \mathbb{Z}$.

Then the measure $\mu_{e_j}(\cdot)$ on $[0, 1)$ prescribes the distribution of the wavepacket
coefficients of the function

$(U^p \varphi_n)(t - j) = 2^{p/2} \varphi_n(2^p t - j)$ (8.3.7)

relative to the ONB
$\{ \varphi_m(\cdot - k) \mid m \in \mathbb{N}_0,\ k \in \mathbb{Z} \}.$ (8.3.8)


Proof. The result follows from (8.2.34) in Corollary 8.2.9 above. This is the formula
which gives the expansion of $U^p \varphi_n(\cdot - j)$ in the ONB (8.3.8).

Hence, to prove the proposition, we only need to compute the $\ell^2$-norm of the
expansion coefficients from (8.2.34). The computation is just a simple application of
Parseval’s formula to the standard Fourier basis

$e_k(z) = z^k \quad \text{for } k \in \mathbb{Z}, \text{ and } z \in \mathbb{T}.$

Indeed, summing the expansion coefficients over $k$, we get

$\sum_{k \in \mathbb{Z}} |\langle S_I e_k \mid e_j \rangle|^2 = \sum_{k \in \mathbb{Z}} |\langle e_k \mid S_I^* e_j \rangle|^2$
$\qquad = \|S_I^* e_j\|^2$ (by Parseval)
$\qquad = \langle e_j \mid S_I S_I^* e_j \rangle$
$\qquad = \langle e_j \mid P(I) e_j \rangle$
$\qquad = \|P(I) e_j\|^2$
$\qquad = \mu_{e_j}(\text{the dyadic interval labeled by } I)$ (see (8.3.4)),

which is the desired conclusion. □
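
The quantity $\|S_I^* e_j\|^2$ from the proof can be evaluated numerically. The following Python sketch does this under stated assumptions: it uses the Haar pair $m_0 = (1 + e_1)/2$, $m_1 = (1 - e_1)/2$ and the convention $(S_i f)(x) = \sqrt{2}\, m_i(x) f(2x)$ of Exercise 8.4 with $N = 2$. The filter choice and the names `HAAR`, `apply_adjoint`, `mu`, and `distribution` are illustrative and are not taken from Lemma 7.6.1. Functions on $\mathbb{T}$ are stored as dictionaries of Fourier coefficients.

```python
import math
from itertools import product

SQRT2 = math.sqrt(2.0)

# Assumed Haar filters: Fourier coefficients of m_0 = (1 + e_1)/2 and m_1 = (1 - e_1)/2.
HAAR = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: -0.5}}


def apply_adjoint(i, c, filt=HAAR):
    """Apply S_i^* to a function with Fourier coefficients c = {n: c_n}.

    Uses (S_i^* c)_j = sqrt(2) * sum_n conj(a_i[n - 2j]) * c_n, where a_i holds
    the Fourier coefficients of m_i.
    """
    out = {}
    for n, c_n in c.items():
        for k, a_k in filt[i].items():
            if (n - k) % 2 == 0:
                j = (n - k) // 2
                out[j] = out.get(j, 0.0) + SQRT2 * complex(a_k).conjugate() * c_n
    return out


def mu(I, j, filt=HAAR):
    """Return ||S_I^* e_j||^2 = <e_j | P(I) e_j> for I = (i_1, ..., i_p)."""
    c = {j: 1.0}
    for i in I:  # S_I^* = S_{i_p}^* ... S_{i_1}^*, so S_{i_1}^* acts first
        c = apply_adjoint(i, c, filt)
    return sum(abs(v) ** 2 for v in c.values())


def distribution(p, j, filt=HAAR):
    """Values of mu_{e_j} on all 2**p dyadic intervals of length 2**(-p)."""
    return {I: mu(I, j, filt) for I in product((0, 1), repeat=p)}
```

For the Haar pair every dyadic interval of length $2^{-p}$ receives mass $2^{-p}$, so `distribution(p, j)` is uniform and its values add up to $\|e_j\|^2 = 1$. Other filters, passed through the `filt` argument in the same dictionary format, generally give non-uniform distributions.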


8.4 Multiresolutions and tensor products 


In this section we show that the tensor factorization of the unitary scaling operator 
U from Section 8.2 gives rise to a general class of multiresolutions. Multiresolu- 
tions were introduced into wavelet theory by Mallat [Mal89], and they are based on 
ideas of Kolmogorov [Kol77] from dynamics, and on the notion of a martingale from 
probability theory [Wil91], as we noticed in Chapters 2 and 3 above. 

In the present context, we shall take “multiresolution” to be defined as follows.

Definition 8.4.1. Let $U$ be a unitary operator in a Hilbert space $\mathcal{H}$, and let $\mathcal{H}_0$ be
a closed subspace in $\mathcal{H}$. We say that $\mathcal{H}_0$ generates a multiresolution for $U$ if the
following two conditions (i)–(ii) hold (if all three conditions (i)–(iii) hold, we say
that the multiresolution is pure):

(i) $\mathcal{H}_0 \subset U \mathcal{H}_0,$
(ii) $\bigvee_{n \in \mathbb{Z}} U^n \mathcal{H}_0 = \mathcal{H},$
(iii) $\bigwedge_{n \in \mathbb{Z}} U^n \mathcal{H}_0 = \{0\},$

where the symbols $\bigvee$ and $\bigwedge$ denote the lattice operations from Hilbert space, i.e.,
$\bigvee$ applied to a family of closed subspaces in $\mathcal{H}$ means “the closed linear span”, and
$\bigwedge$ means “intersection”.


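Remark 8.4.2. Let a system $(U, \mathcal{H}, \mathcal{H}_0)$ be given as in Definition 8.4.1, but suppose
only (i)–(ii) are satisfied. Then $U$ and $\mathcal{H}$ may be modified such that the resulting
reduced system $(U', \mathcal{H}', \mathcal{H}_0')$ satisfies all three conditions (i)–(iii). This follows from
a simple application of the Wold decomposition; see [BrJo02b].

We now sketch the details of proof: Suppose only (i)–(ii) hold. Then set

$\mathcal{K} := \bigwedge_{n \in \mathbb{Z}} U^n \mathcal{H}_0,$ (8.4.1)
$\mathcal{H}' := \mathcal{H} \ominus \mathcal{K} = \{ f \in \mathcal{H} \mid \langle f \mid k \rangle = 0 \text{ for all } k \in \mathcal{K} \},$ (8.4.2)
and
$\mathcal{H}_0' := \mathcal{H}_0 \ominus \mathcal{K}.$


We leave to the reader the verification of the assertions; i.e., that the reduced
system $\mathcal{H}_0 \to \mathcal{H}_0'$, $\mathcal{H} \to \mathcal{H}'$, and $U \to U'$, $U' := U|_{\mathcal{H}'}$, satisfies all three conditions
(i)–(iii). □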

The significance of Definition 8.4.1 is reflected in the following general result on 
unitary equivalence. 


Theorem 8.4.3. Let $(U, \mathcal{H}, \mathcal{H}_0)$ and $(U', \mathcal{H}', \mathcal{H}_0')$ be two given systems as stated in
Definition 8.4.1, and assume that all three conditions (i)–(iii) hold for both systems,
i.e., that the two multiresolutions are pure.

Then there is a unitary isomorphism

$W \colon \mathcal{H} \to \mathcal{H}'$ (8.4.3)
which satisfies the following two conditions:
$W \mathcal{H}_0 = \mathcal{H}_0'$ (8.4.4)
and
$W U = U' W$ (intertwining). (8.4.5)


(We say that the systems are unitarily equivalent.)

Proof. This is a standard result in operator theory, and we refer the reader to
[BrJo02b, Chapter 2]. We further add that the underlying ideas date back to
Kolmogorov [Kol77] in the 1930s. □


We now show that the tensor factorization $U = \sum_i P_i \otimes S_i^*$ from Section 8.2
above gives rise to multiresolutions.


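Theorem 8.4.4. Let $N \in \mathbb{N}$, $N \ge 2$, and let $(X, \sigma, \tau_i)$ satisfy the conditions in
Lemma 8.2.2. Let $(P_i)_{i=0}^{N-1}$ be the corresponding representation of $\mathcal{O}_N$ in $\ell^2(X)$. Let
$(S_i)_{i=0}^{N-1}$ be a representation of $\mathcal{O}_N$ in a Hilbert space $\mathcal{K}$, and set

$\mathcal{H} := \ell^2(X) \otimes \mathcal{K},$ (8.4.6)

$U := \sum_{i=0}^{N-1} P_i \otimes S_i^*.$ (8.4.7)


Let a subset $E \subset X$ satisfy (a) and (b) below,

(a) $\sigma(E) = E$
and
(b) $\bigcup_{n \in \mathbb{N}_0} \sigma^{-n}(E) = X,$
and set
$\mathcal{H}_n := \ell^2(\sigma^{-n}(E)) \otimes \mathcal{K}.$ (8.4.8)
Then
$U^n \mathcal{H}_0 = \mathcal{H}_n$ (8.4.9)
and
$(U, \mathcal{H}, \mathcal{H}_0)$ is a multiresolution system, (8.4.10)
i.e.,

$\mathcal{H}_0 := \ell^2(E) \otimes \mathcal{K}$
generates a multiresolution for $U$.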


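Proof. To prove that (i) in Definition 8.4.1 is satisfied note that for every subset
$E \subset X$, we have
$E \subset \sigma^{-1}(\sigma(E)).$ (8.4.11)


Substituting (a) from the theorem, we get

$E \subset \sigma^{-1}(E) = \bigcup_{i=0}^{N-1} \tau_i(E).$ (8.4.12)


But if $e \in E$ and $k \in \mathcal{K}$, then

$|e\rangle \otimes k = U U^* \left( |e\rangle \otimes k \right)$ (8.4.13)
$\qquad = U \left( \sum_i P_i^* |e\rangle \otimes S_i k \right)$
$\qquad = U \left( \sum_i \chi_{\tau_i(X)}(e)\, |\sigma(e)\rangle \otimes S_i k \right).$


The desired conclusion $\mathcal{H}_0 \subset U \mathcal{H}_0$ follows.
If $n \in \mathbb{N}_0$, then

$U^n \left( |e\rangle \otimes \mathcal{K} \right) = \sum_{i_1, \dots, i_n} |\tau_{i_1} \cdots \tau_{i_n} e\rangle \otimes S_{i_1}^* \cdots S_{i_n}^* \mathcal{K}.$ (8.4.14)


Now, combining (8.4.14) and (8.4.12), we conclude that
$U^n \mathcal{H}_0 \subset \mathcal{H}_n.$


However, the argument from (8.4.13) shows that this inclusion is in fact an identity.
The remaining property (ii) from Definition 8.4.1 clearly follows from this, and
(b) in the theorem. □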


Remark 8.4.5. In applications of Theorem 8.4.4, it is useful to reduce general
multiresolutions to a system of minimal ones. If $N$ and $(X, \sigma, \tau_i)$ are as stated in Theorem
8.4.4, it is helpful to make a careful choice of the sets $E$ to be used in (a) and (b) of
the theorem.

If $E \subset X$ satisfies $\sigma(E) = E$, we say that $E$ is minimal if the following holds:

$\varnothing \neq A \subset E,\ \sigma(A) = A \implies A = E.$ (8.4.15)


In our example (8.2.6) above, we had $E = \{-3, -2, -1, 0\}$ satisfy $\sigma(E) = E$,
but each of the sets

$E_1 = \{0\},$
$E_2 = \{-2, -1\},$
and
$E_3 = \{-3\}$

satisfies $\sigma(E_i) = E_i$, and each $E_i$ is minimal.


Remark 8.4.6. The next result shows that such minimal sets $E_i$ may be chosen in
general. However, it is only for rather special branching systems $(X, \sigma, \tau_i)$ that the
choices are natural.

Lemma 8.4.7. Let $N \in \mathbb{N}$, $N \ge 2$, and let $(X, \sigma, \tau_i)$ satisfy the conditions in Lemma
8.2.2. Suppose some subset $E \subset X$ satisfies conditions (a) and (b) in Theorem 8.4.4.
Then there are subsets $E_i \subset E$ such that

$\sigma(E_i) = E_i,$ (8.4.16)
$E_i \cap E_j = \varnothing \quad \text{for } i \neq j,$ (8.4.17)
$\bigcup_i E_i = E,$ (8.4.18)
and
each $E_i$ is a minimal solution to (8.4.16). (8.4.19)
Proof. Zorn’s lemma. □


Definition 8.4.8. If $A$ is a subset of $X$, we set

$\hat{A} := \bigcup_{n \in \mathbb{N}_0} \sigma^{-n}(A).$ (8.4.20)


Proposition 8.4.9. Let $N \in \mathbb{N}$, $N \ge 2$, and let $(X, \sigma, \tau_i)$ satisfy the conditions
in Lemma 8.2.2. Let $(E, (E_i)_{i \in I})$ be a system of subsets in $X$ which satisfies the
conditions in Lemma 8.4.7, where $I$ is a chosen index set.

Then
$\hat{E}_i \cap \hat{E}_j = \varnothing \quad \text{for } i \neq j$ (8.4.21)
and
$\bigcup_{i \in I} \hat{E}_i = X.$ (8.4.22)
Proof. We leave the easy details to the reader. □


Remark 8.4.10. The significance of conditions (8.4.21)–(8.4.22) in Proposition
8.4.9 is that they generate mutually orthogonal multiresolutions in the Hilbert space
$\ell^2(X) \otimes \mathcal{K}$. Specifically, we get an orthogonal decomposition

$\ell^2(X) \otimes \mathcal{K} = \bigoplus_{i \in I} \ell^2(\hat{E}_i) \otimes \mathcal{K}$ (8.4.23)

which reduces the operators, and the representations on $\ell^2(X) \otimes \mathcal{K}$ considered above.

Returning to the three sets $E_1$, $E_2$, and $E_3$ from Remark 8.4.5 (and Figure 8.2,
p. 161, from (8.2.6)), we get $\mathbb{Z}$ written as a disjoint union of the associated three sets
$\hat{E}_1$, $\hat{E}_2$, and $\hat{E}_3$. The sets are equivalence classes for an equivalence relation studied
in [BrJo99a]:

$\hat{E}_1 = \{0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, \dots\},$
$\hat{E}_2 = \{-2, -1, -4, 1, -8, -5, 2, 5, -16, -13, -10, -7, 4, 7, 10, 13, \dots\},$
and
$\hat{E}_3 = \{-3, -6, -12, -9, -24, -21, -18, -15, \dots\}.$

Fig. 8.3. The pyramid on $E_2 = \{-2, -1\}$ (may be seen as a subdiagram of Figure 8.2, p. 161).

The two sets $\hat{E}_1$ and $\hat{E}_3$ are singly generated. The middle one can be understood
from the pyramid shown in Figure 8.3.
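
The three lists above can be regenerated level by level. The Python sketch below assumes that the example (8.2.6) is the system on $\mathbb{Z}$ with $\sigma(2n) = n$ and $\sigma(2n + 3) = n$, i.e., with branches $\tau_0(x) = 2x$ and $\tau_1(x) = 2x + 3$; this reading is inferred from the listed elements and is an assumption, as are the function names `tau`, `sigma`, and `saturation`.

```python
def tau(i, x):
    """Assumed branches of the inverse: tau_0(x) = 2x and tau_1(x) = 2x + 3."""
    return 2 * x + 3 * i


def sigma(x):
    """Assumed endomorphism of Z: sigma(2n) = n and sigma(2n + 3) = n."""
    return x // 2 if x % 2 == 0 else (x - 3) // 2


def saturation(E, depth):
    """Elements of sigma^{-n}(E), n = 0, ..., depth, in order of first appearance."""
    seen = list(E)
    level = list(E)
    for _ in range(depth):
        new = []
        for x in level:
            for i in (0, 1):
                y = tau(i, x)
                if y not in seen and y not in new:
                    new.append(y)
        seen += new
        level = new
    return seen
```

With these conventions, `saturation([0], 4)`, `saturation([-2, -1], 3)`, and `saturation([-3], 3)` return the elements in the same order as they are printed above, which suggests that the printed order follows the levels of the pyramid.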




Exercises 


8.1. Let $\mathcal{H}$ be a complex Hilbert space. By a projection, say $P$, we mean a linear
operator $P \colon \mathcal{H} \to \mathcal{H}$ which satisfies $P = P^2 = P^*$. An operator $S \colon \mathcal{H} \to \mathcal{H}$ is
called a partial isometry if $S^* S$ is a projection. Recall that the norm of a bounded
linear operator $T \colon \mathcal{H} \to \mathcal{K}$ is defined as

$\|T\| = \sup \{ \|Th\| \mid h \in \mathcal{H},\ \|h\| = 1 \}.$

(a) Suppose that an operator $P$ satisfies $P = P^2$. Show that $P$ is a projection if
and only if $\|P\| = 1$.

(b) Let $S$ be an operator such that $S^* S$ is a projection. Then show that $S S^*$ is also
a projection. The two projections are called the initial, resp., the final projection if $S$
is given to be a partial isometry.

(c) Show that if $S$ is a non-zero partial isometry, then $\|S\| = 1$.

8.2. Let $\mathcal{H}$ be a Hilbert space, and let $N \in \mathbb{N}$. A set $S_0, \dots, S_{N-1}$ of $N$ operators is
said to satisfy the Cuntz relations if (i) and (ii) hold:

(i) $S_i^* S_j = \delta_{i,j} I$ (where $I$ denotes the identity operator), and

(ii) $\sum_{i=0}^{N-1} S_i S_i^* = I.$

(a) Suppose it is given that the individual operators $S_i$ are isometries, and that (ii)
holds. Then show that (i) is automatic.

(b) Suppose that for some $N$, the Cuntz relations have a representation in $\mathcal{H}$.
Then show that $\mathcal{H}$ must be infinite-dimensional.


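8.3. Let $\mathcal{H} = \ell^2(\mathbb{Z})$, and let $N \in \mathbb{N}$ be fixed. Let $E$ be a subset of $\mathbb{Z}$ such that there
is a bijection between $E$ and the cyclic group $\mathbb{Z}_N = \mathbb{Z}/N\mathbb{Z}$. Define

$S_r e_k = e_{r + Nk}, \quad r \in E,\ k \in \mathbb{Z}.$
Then show that
$S_r^* S_{r'} = \delta_{r,r'} I \quad \text{for all } r, r' \in E,$
$\sum_{r \in E} S_r S_r^* = I.$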
reE 
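8.4. Let $\mathcal{H} = L^2(\mathbb{T})$, and let $N \in \mathbb{N}$ be given. Let $m_0, m_1, \dots, m_{N-1} \in L^\infty(\mathbb{T})$,
where we give $\mathbb{T}$ the representation $\mathbb{R}/\mathbb{Z}$. Set

$S_j f(x) = \sqrt{N}\, m_j(x) f(Nx) \quad \text{for } f \in L^2(\mathbb{T}), \text{ and } 0 \le j < N.$

(We identify functions on $\mathbb{T}$ with 1-periodic functions on $\mathbb{R}$.)
(a) Show that the operators $(S_j)_{j=0}^{N-1}$ yield a representation of the Cuntz relations
if and only if the matrix

$\begin{pmatrix} m_0(x) & m_0\left(x + \frac{1}{N}\right) & \cdots & m_0\left(x + \frac{N-1}{N}\right) \\ m_1(x) & m_1\left(x + \frac{1}{N}\right) & \cdots & m_1\left(x + \frac{N-1}{N}\right) \\ \vdots & \vdots & & \vdots \\ m_{N-1}(x) & m_{N-1}\left(x + \frac{1}{N}\right) & \cdots & m_{N-1}\left(x + \frac{N-1}{N}\right) \end{pmatrix}$

is unitary for a.e. $x \in \mathbb{R}$.
(b) Show that the following three conditions are equivalent:
(i) the functions $(m_j)_{j=0}^{N-1}$ satisfy the unitarity condition in (a),

(ii) $\sum_{j=0}^{N-1} \overline{m_j\left(x + \frac{k}{N}\right)}\, m_j\left(x + \frac{l}{N}\right) = \delta_{k,l}$, and

(iii) $\sum_{j=0}^{N-1} \overline{m_k\left(x + \frac{j}{N}\right)}\, m_l\left(x + \frac{j}{N}\right) = \delta_{k,l}.$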
8.5. Examples 
For convenience, make the identification $\mathbb{T} \cong [0, 1)$, $\mathcal{H} = L^2(\mathbb{T}) \cong L^2(0, 1)$, so
that Haar measure on $\mathbb{T}$ corresponds to the restricted Lebesgue measure on the unit
interval. Functions on $[0, 1)$ will be identified with 1-periodic functions on $\mathbb{R}$ via the
obvious extension. Set $e_n(x) := e^{i 2\pi n x}$, $n \in \mathbb{Z}$, $x \in \mathbb{R}$.

(a) For $N = 2$, check that the following pairs of functions satisfy the unitarity
conditions which are listed in (a) (and (b)) from Exercise 8.4.

Example 1: $m_0(x) = \chi_{[0, 1/2)}(x)$, $m_1(x) = \chi_{[1/2, 1)}(x)$.


Example 2: mo (x) = aa mi (x)= a. 

i 
Example 3: mo (x) = Wei m (x)= am. 
Example 4: $m_0(x) = \cos(\pi x)$, $m_1(x) = \sin(\pi x)$.
Example 5: mo (x) = 4a) m, (x) = 48) 
Example 6: mo (x) = +t) m (x) = 138). 


(b) For $N = 3$, check that the following systems of three functions satisfy the
unitarity conditions which are listed in (a) (and (b)) in Exercise 8.4.

Example 7: $m_0(x) = \chi_{[0, 1/3)}(x)$, $m_1(x) = \chi_{[1/3, 2/3)}(x)$, $m_2(x) = \chi_{[2/3, 1)}(x)$.

Example 8: $m_0(x) = \frac{1}{\sqrt{3}}$, $m_1(x) = \frac{e_1(x)}{\sqrt{3}}$, $m_2(x) = \frac{e_2(x)}{\sqrt{3}}$.

Example 9: Set $\zeta_3 := e^{i 2\pi/3} = -\frac{1}{2} + i \frac{\sqrt{3}}{2}$, and

$m_0(x) = \frac{1}{3} \left( 1 + e_1(x) + e_2(x) \right),$
$m_1(x) = \frac{1}{3} \left( 1 + \zeta_3 e_1(x) + \bar{\zeta}_3 e_2(x) \right),$
$m_2(x) = \frac{1}{3} \left( 1 + \bar{\zeta}_3 e_1(x) + \zeta_3 e_2(x) \right).$

Example 10: $m_0(x) = \frac{1 + e_2(x)}{\sqrt{6}}$, $m_1(x) = \frac{1 - e_2(x)}{\sqrt{6}}$, $m_2(x) = \frac{e_1(x)}{\sqrt{3}}$.

8.6. For the function $m_0$ in the examples in Exercise 8.5, consider the corresponding
isometric operators in $\mathcal{H} = L^2(0, 1)$ given by

$S_0 f(x) = \sqrt{2}\, m_0(x) f(2x) \quad \text{if } N = 2,$

and

$S_0 f(x) = \sqrt{3}\, m_0(x) f(3x) \quad \text{if } N = 3.$

Show that the condition

$\lim_{n \to \infty} S_0^{*n} f = 0, \quad f \in \mathcal{H},$ (8E.1)

is satisfied in Examples 1, 4, 5, 6, 7, 9, and 10, but not in Example 2, 3, or 8.


8.7. (a) For the representations of $\mathcal{O}_2$ in Examples 1, 4, 5, and 6, verify irreducibility,
i.e., that there is no Hilbert space $\mathcal{K}$, $0 \subsetneq \mathcal{K} \subsetneq \mathcal{H}$, which reduces the action of $\mathcal{O}_2$ on
$\mathcal{H}$.

(b) For the representations of $\mathcal{O}_3$ in Examples 7, 9, and 10, verify irreducibility
on the Hilbert space $\mathcal{H} = L^2(0, 1)$.

(c) For the representations in Examples 2, 3, and 8, directly verify reducibility.


8.8. Show that the respective representations of $\mathcal{O}_2$ and $\mathcal{O}_3$ lift to wavelet bases (or
normalized tight frames) for $L^2(\mathbb{R})$ in the cases of Examples 1, 4, 5, and 6 (the
dyadic case), and for Examples 7 and 9 (the triadic case).


8.9. Show that the respective representations of $\mathcal{O}_2$ and $\mathcal{O}_3$ do not lift to wavelet
bases (or frames) in the Hilbert space $L^2(\mathbb{R})$ in the cases of Examples 2, 3, 8, or 10.


8.10. Discuss the relevance of the representation of $\mathcal{O}_3$ from Example 10 (in
Exercise 8.5) for the middle-third Cantor set $X_3$, and its Hausdorff measure $h^s$ of
Hausdorff dimension $s = \log_3 2 = \frac{\ln 2}{\ln 3}$.


References and remarks 


We were [initially] entirely in Heisenberg’s footsteps. He had the idea that 
one should take matrices, although he did not know that his dynamical quan- 
tities were matrices.... And when one had such a programme of formulat- 
ing everything in matrix language, it takes some effort to get rid of matrices. 
Though it seemed quite natural for me to represent perturbation theory in 
the algebraic way, this was not a particularly new way. —Max Born 


Wavelet packets were pioneered by Coifman, Meyer, and Wickerhauser; see especially
the seminal papers [CoMW95], [CoWi93], and [Wic93]. The idea and the
motivation come from the equivalence of the two properties (i) and (ii) in Corollary
8.2.9. By a careful choice of the set $\mathcal{A}$, one is able to adjust the orthonormal basis
(8.2.29) in such a way that the frequency bands are adapted to the space (or time)
localization. Since the functions in (8.2.29) have the form

$2^{p/2} \varphi_n(2^p t - j),$

it is clear that large choices of $p$ yield narrow frequency bands and vice versa.

Other helpful references on the subject may be found in Wickerhauser’s book 
[Wic94]; see also [CoMW95], [BeBe95], [Jor05], and [Jor04b]. 

We have outlined in this chapter a particular interplay between on the one hand, 
certain selection, or optimal choice problems, in wavelet theory, and on the other, 
the wider field which often goes under the label of “non-commutative probability 
theory.” While we are only scratching the surface of this very active research area, 
we want to mention two more directions of non-commutative probability which are 
not covered here: (1) Free probability, and (2) dilations of finite or infinite sets of 
non-commuting operators in Hilbert space. Readers who wish to learn more about 
(1) might wish to consult the monograph [VoDN92], [JoSW95], or [KoSp04]; and 
for (2) we recommend [Pop89], and [DaKS01].


Pairs of representations of the Cuntz algebras $\mathcal{O}_n$,
and their application to multiresolutions


If one finds a difficulty in a calculation which is otherwise quite convincing,
one should not push the difficulty away; one should rather try to make it the
centre of the whole thing. —Werner Heisenberg


PREREQUISITES: Permutations; trace; tensor products of matrices; Cantor. 


Prelude 


This chapter will resume our study of the geometric approach via Hilbert space to 
separation of variables/tensor product stressed in the last two chapters; and we will 
note a number of applications of this idea. The approach further serves to clarify a 
number of themes involving combinatorics of the recursive bases studied throughout 
the book. The separate themes are as follows. 

(1) The tensor-product idea yields an explicit representation of the unitary oper- 
ator U which we use to model scale-similarity in our constructions both for standard 
wavelets and for fractals; see especially Lemma 9.3.2. 

(2) Using tensor products we show that there are two representations of the Cuntz
relations involved in modeling basis constructions on scale-similarity (such as was
pioneered first for the standard wavelet bases in $L^2(\mathbb{R}^d)$). As already noted, our use
of multiresolutions naturally entails representation of the Cuntz relations;
representations which come directly from subband filters. But in addition, we find a second
family of representations, one which reveals symmetry under certain permutations;
see Section 9.3. The resulting second class of representations of the Cuntz relations
has in fact already been studied earlier (for different purposes) under the name
“permutative representations.”

(3) The tensor-product idea further shows that the representation-theoretic
approach is as useful for new basis constructions in function spaces on fractals as it is
for the standard wavelet bases in $L^2(\mathbb{R}^d)$. In Chapter 4 we introduced natural Hilbert
spaces on affine fractals, and in Theorems 9.6.1 and 9.6.2 below we show how these
spaces admit bases of generalized wavelets (i.e., “fractal wavelets”) and of wavelet
packets.

(4) Our use of tensor products helps organize the possibilities for the wavelet 
packets introduced in Chapter 7. In Theorem 9.4.3 we show that when a system of 
(admissible) digital filters is given, then the feasible choices of wavelet packets are 
in one-to-one correspondence with certain discrete pavings, or tilings. 

(5) Finally, in Section 9.5 we outline how our use of tensor products clarifies the
discrete wavelet transform. Recall that a choice of a multiresolution (in the generalized
sense outlined in Chapter 7) automatically entails a discrete wavelet transform.
When a resolution subspace is selected, this allows us to process data which is
organized in an associated sequence space (i.e., in an $\ell^2$-space); and it is precisely in
these $\ell^2$-sequence spaces where our Cuntz relations are realized.


9.1 Factorization of unitary operators in Hilbert space 


In this chapter we show that the examples from Sections 8.2 to 8.4 in the previous 
chapter may be formulated and understood within an operator-theoretic framework. 
This abstract formulation helps to unify the examples, and to bring to light an un- 
derlying representation-theoretic flavor to the subject. At the same time, this general 
formulation may be of independent interest in operator theory. The examples from 
Chapter 8, i.e., wavelet packets and permutative representations, have appeared in 
various research papers in pure and applied mathematics; and it may now be an op- 
portune time to try to give them a general and more axiomatic formulation. 

This concluding chapter begins with a study of unitary operators $U$ in Hilbert
space. Motivated by the examples from Sections 8.2, 8.3, and 8.4, we consider the
following general question: If $U$ is a unitary operator in a Hilbert space $\mathcal{H}$, when
does it admit factorizations

$U = \sum_i S_i \otimes V_i,$ (9.1.1)

where $(S_i)$ and $(V_i)$ are representations of the Cuntz algebras $\mathcal{O}_n$? We outline a
number of results and applications regarding these factorizations.

We show that fractals and wavelet constructions have attractive computational
features. What the different examples have in common is a kind of scaling-similarity.
The scaling will be represented by the unitary operator $U$. As we move up and down
a scale of powers of $U$, there will be intermediate bands corresponding to subdivisions.
One representation $(S_i)$ of $\mathcal{O}_n$ serves to encode these bands, and a second
representation is then used in recovering $U$ from $(S_i)$. This leads to a representation
(9.1.1) for the operator $U$.


9.2 Generalized multiresolutions 


The relations known in operator theory as the Cuntz relations, i.e., $S_i^* S_j = \delta_{i,j} \mathbb{1}$
and $\sum_i S_i S_i^* = \mathbb{1}$, and which define the Cuntz algebras $\mathcal{O}_n$, $n = 2, 3, \dots$, also
play a big role in signal processing and wavelet analysis. A common feature in these
applications is a detailed spectral analysis of some unitary operator $U \colon \mathcal{H} \to \mathcal{H}$
which plays the role of scaling. This could be scaling between different resolutions
in a sequence of closed subspaces of the underlying Hilbert space $\mathcal{H}$, or the scaling
could refer to a system of frequency bands.

Two structures are present: scaling from one band to the next, and operations 
within each band. In this chapter, we show that these applications may be subsumed 
in a certain tensor factorization of H. 

In the simplest case, $\mathcal{H} = L^2(\mathbb{R})$ and $U$ is the dyadic scaling operator

$U f(t) = \sqrt{2}\, f(2t) \quad \text{for } f \in L^2(\mathbb{R}) \text{ and } t \in \mathbb{R}.$ (9.2.1)

In the case of a multiresolution analysis (MRA), there is a subspace $V_0 \subset \mathcal{H} =
L^2(\mathbb{R})$ such that

$V_0 \subset U V_0,$ (9.2.2)
$\bigwedge_{n \in \mathbb{Z}} U^n V_0 = \{0\},$ (9.2.3)
and
$\bigvee_{n \in \mathbb{Z}} U^n V_0 = \mathcal{H},$ (9.2.4)

where $\bigwedge$ and $\bigvee$ refer to the usual lattice operation on closed subspaces of $\mathcal{H}$.
In the MRA case for wavelets, see [Dau92], there is a function $\varphi_0 \in V_0$ such that
$V_0$ is the closed span of the translates

$\{ \varphi_0(\cdot - k) \mid k \in \mathbb{Z} \}.$ (9.2.5)

It is then possible (in favorable cases) to arrange that the translates in (9.2.5) are
mutually orthogonal vectors of unit norm, i.e., that

$\int_{\mathbb{R}} \overline{\varphi_0(t)}\, \varphi_0(t - k)\, dt = \delta_{0,k}.$ (9.2.6)

There is a pyramid algorithm (defined below) for generating a sequence $\varphi_0, \varphi_1, \dots$
with
$\varphi_n \in U^n V_0$ (9.2.7)

and such that
$\{ \varphi_n(\cdot - k) \mid n \in \mathbb{N}_0,\ k \in \mathbb{Z} \}$ (9.2.8)

is an orthonormal basis (ONB) for $L^2(\mathbb{R})$, i.e., such that
$\langle \varphi_n(\cdot - k) \mid \varphi_{n'}(\cdot - k') \rangle = \delta_{n,n'} \delta_{k,k'}.$

We show that this structure arises from two representations of the Cuntz algebra $\mathcal{O}_2$,
see [Cun77], $(S_i)$ and $(V_i)$, such that the unitary operator $U$ of (9.2.1) has the form

$U = \sum_i S_i \otimes V_i.$ (9.2.9)

The two representations are defined naturally from the wavelet data: the isometries
$S_i$ act in $\ell^2(\mathbb{N}_0)$ and the $V_i$’s in $L^2(\mathbb{T})$. In the simplest case, we have

$S_i |n\rangle = |2n + i\rangle \quad \text{for } n \in \mathbb{N}_0 \text{ and } i = 0, 1,$ (9.2.10)

where we used Dirac’s terminology for the natural basis $|n\rangle$ in $\ell^2(\mathbb{N}_0)$. The isometries
$V_i$, $i = 0, 1$, are defined by two functions $m_i$ on $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ by

$(V_i f)(x) = \sqrt{2}\, m_i(x) f(2x) \quad \text{for } f \in L^2(\mathbb{T}),\ x \in \mathbb{R},\ i = 0, 1.$ (9.2.11)

The Cuntz relations for (9.2.11) are equivalent to the requirement that the matrix

$\begin{pmatrix} m_0(x) & m_0\left(x + \frac{1}{2}\right) \\ m_1(x) & m_1\left(x + \frac{1}{2}\right) \end{pmatrix}$ (9.2.12)

is unitary a.e. $x \in \mathbb{T}$.
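
The unitarity requirement on the matrix (9.2.12) is easy to test numerically at sample points. The Python sketch below is an illustration only; the helper names `filter_matrix` and `is_unitary` are hypothetical. The pair $m_0(x) = \cos(\pi x)$, $m_1(x) = \sin(\pi x)$ is Example 4 of Exercise 8.5, and $x$ is sampled in $[0, 1/2)$ so that $x$ and $x + 1/2$ both lie in $[0, 1)$, where the formulas apply without periodic extension.

```python
import math


def filter_matrix(m0, m1, x):
    """The 2x2 matrix (m_i(x + k/2)), rows i = 0, 1 and columns k = 0, 1, as in (9.2.12)."""
    return [[m0(x), m0(x + 0.5)],
            [m1(x), m1(x + 0.5)]]


def is_unitary(M, tol=1e-12):
    """Check M M^* = I, which for a square matrix is equivalent to unitarity."""
    size = len(M)
    for r in range(size):
        for s in range(size):
            entry = sum(complex(M[r][k]) * complex(M[s][k]).conjugate()
                        for k in range(size))
            if abs(entry - (1.0 if r == s else 0.0)) > tol:
                return False
    return True


# Example 4 of Exercise 8.5.
def m0(x):
    return math.cos(math.pi * x)


def m1(x):
    return math.sin(math.pi * x)
```

A call such as `is_unitary(filter_matrix(m0, m1, 0.3))` returns `True`, while a pair of identical constant functions fails the test because the two rows of the matrix coincide.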

It turns out that there are much more general systems of representations of the 
Cuntz algebras O, which nonetheless involve the same geometry. The applications 
include the theory of iterated function systems (IFS), symbolic dynamics, tilings, 
harmonic analysis, and fractals. 


9.3 Permutation of bases in Hilbert space 


Definition 9.3.1. Let $\mathcal{H}$ be a complex Hilbert space. We say that a finite set of
isometries $S_i \colon \mathcal{H} \to \mathcal{H}$, $i = 1, \dots, n$, forms a representation of the Cuntz algebra $\mathcal{O}_n$ if the
following system of identities holds:

(a) $S_i^* S_j = \delta_{i,j} \mathbb{1}$ and (b) $\sum_{i=1}^{n} S_i S_i^* = \mathbb{1},$ (9.3.1)

where $\mathbb{1}$ denotes the identity operator.

It is well known, see, e.g., [Cun77] and [BrJP96], that there is a one-to-one
correspondence between representations of the $C^*$-algebra $\mathcal{O}_n$, defined by the relations
(9.3.1), and the set of all systems of isometries $(S_i)$ subject to (9.3.1). Moreover,
Cuntz [Cun77] showed that the $C^*$-algebra $\mathcal{O}_n$ is simple.

As a result, we will identify the set of all representations of $\mathcal{O}_n$ on some Hilbert
space $\mathcal{H}$ with the systems (9.3.1): A system (9.3.1) will be said to be an element
in $\operatorname{Rep}(\mathcal{O}_n, \mathcal{H})$. It is known that the set of equivalence classes of irreducible
representations of $\mathcal{O}_n$ is “too large” to admit a meaningful classification, see [Gli60];
specifically, this set does not have a Borel cross section.


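Lemma 9.3.2. Let $\mathcal{H}_i$, $i = 1, 2$, be complex Hilbert spaces, and let $n \in \mathbb{N}$, $n \ge 2$,
be given. Let $(S_i) \in \operatorname{Rep}(\mathcal{O}_n, \mathcal{H}_1)$, and let $V_1, \dots, V_n$ be a system of operators in
the Hilbert space $\mathcal{H}_2$. Then the following two conditions are equivalent:

(i) the operator

$U := \sum_{i=1}^{n} S_i \otimes V_i$ (9.3.2)

is unitary in $\mathcal{H} := \mathcal{H}_1 \otimes \mathcal{H}_2$,
and
(ii) $(V_i) \in \operatorname{Rep}(\mathcal{O}_n, \mathcal{H}_2)$.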


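Proof. Consider arbitrary pairs of vectors $x_i, y_i \in \mathcal{H}_i$, $i = 1, 2$. Using the relations
(9.3.1) for the $S$-system, we find that the unitarity property (i) is equivalent to the
following two identities:

$\langle x_1 \mid y_1 \rangle \left\langle x_2 \,\Big|\, \sum_i V_i V_i^* y_2 \right\rangle = \langle x_1 \mid y_1 \rangle \langle x_2 \mid y_2 \rangle$ (9.3.3)

and

$\sum_{i,j} \langle x_1 \mid S_i S_j^* y_1 \rangle \langle x_2 \mid V_i^* V_j y_2 \rangle = \langle x_1 \mid y_1 \rangle \langle x_2 \mid y_2 \rangle.$ (9.3.4)

Setting $e(i, j) := S_i S_j^*$, we see that (9.3.1) is a restatement of the familiar matrix
identities
$e(i, j)^* = e(j, i)$ (9.3.5)
and
$e(i, j)\, e(k, l) = \delta_{j,k}\, e(i, l)$ (9.3.6)
for all $i, j, k, l \in \{1, \dots, n\}$.
If we form the $n$-by-$n$ block matrix

$M(V) = \left( V_i^* V_j \right)_{i,j=1}^{n} \in M_n(\mathbb{C}) \otimes B(\mathcal{H}_2) = M_n(B(\mathcal{H}_2)),$ (9.3.7)

the two identities (9.3.3)–(9.3.4) simply state that $M(V)$ is the identity matrix in
$M_n(B(\mathcal{H}_2))$, i.e., that

$V_i^* V_j = \delta_{i,j} I_{\mathcal{H}_2}.$ (9.3.8)

Let trace be the normalized trace on $M_n(\mathbb{C})$, and consider
$\operatorname{trace} \otimes \operatorname{id} \colon M_n(\mathbb{C}) \otimes B(\mathcal{H}_2) \to B(\mathcal{H}_2).$

Applying this to the element $\left( V_i V_j^* \right)_{i,j=1}^{n}$ in $M_n(\mathbb{C}) \otimes B(\mathcal{H}_2)$, we get $\sum_i V_i V_i^*$.
But identity (9.3.2) also yields $(\operatorname{trace} \otimes \operatorname{id}) \left( V_i V_j^* \right) = I_{\mathcal{H}_2}$. As a result, we get

$\sum_i V_i V_i^* = I_{\mathcal{H}_2}.$ (9.3.9)

Combining (9.3.8) and (9.3.9), we get the implication (i) $\Rightarrow$ (ii). Since the converse
implication is immediate, the proof is completed. □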


Definition 9.3.3. Let $X$ be a set, and let $R \colon X \to X$ be an endomorphism. Let
$R^{-1}(x) := \{ y \in X \mid R(y) = x \}$ for $x \in X$. We say that $R$ is an $n$-fold branch
mapping if $R$ is onto and if

$\# R^{-1}(x) = n \quad \text{for all } x \in X.$ (9.3.10)

For a given $n$-fold branch mapping, we shall select $n$ branches of the inverse, i.e.,
$n$ distinct mappings
$\sigma_i \colon X \to X, \quad i = 1, \dots, n,$ (9.3.11)

such that
$R \circ \sigma_i = \operatorname{id}_X \quad \text{for all } i = 1, \dots, n.$ (9.3.12)


Definition 9.3.4. Let $\mathcal{H}$ be a complex Hilbert space, and let $(S_i) \in \operatorname{Rep}(\mathcal{O}_n, \mathcal{H})$ for
some $n \in \mathbb{N}$, $n \ge 2$. We say that $(S_i)$ is a permutative representation, see [BrJo99a],
if each isometry $S_i$ permutes the elements in some orthonormal basis (ONB) for $\mathcal{H}$.


Lemma 9.3.5. [BrJo99a] Up to unitary equivalence of representations every
permutative representation $(S_i)$ of $\mathcal{O}_n$ in a Hilbert space $\mathcal{H}$ has the following form for
some set $X$ and some $n$-fold branch mapping $R \colon X \to X$.

Let $\mathcal{H} = \ell^2(X)$, and for $x \in X$, let $|x\rangle$ be the corresponding basis vector in $\mathcal{H}$.
If $c \colon X \to \mathbb{C}$ is in $\ell^2(X)$, then

$c = \sum_{x \in X} c(x)\, |x\rangle, \quad \|c\|^2 = \sum_{x \in X} |c(x)|^2,$ (9.3.13)
and
$\langle x \mid y \rangle = \delta_{x,y} \quad \text{for all } x, y \in X.$ (9.3.14)
For a choice of distinct branches $\sigma_i$, $i = 1, \dots, n$, set

$S_i |x\rangle := |\sigma_i(x)\rangle.$ (9.3.15)

Proof. Let $(S_i) \in \operatorname{Rep}(\mathcal{O}_n, \mathcal{H})$ be a permutative representation. Let $X$ be an index
set for some ONB which is permuted by the respective isometries $S_i$. As a result,
there are maps $\sigma_i \colon X \to X$ such that

$S_i |x\rangle = |\sigma_i(x)\rangle \quad \text{for } i = 1, \dots, n \text{ and } x \in X.$ (9.3.16)
Using (a) in (9.3.1), we conclude that each $\sigma_i$ is one-to-one, and that
$\sigma_i(X) \cap \sigma_j(X) = \varnothing \quad \text{if } i \neq j.$ (9.3.17)
Using (9.3.14) and (9.3.16), we derive the formula
$S_i^* |\sigma_j(x)\rangle = \delta_{i,j} |x\rangle \quad \text{for } i, j = 1, \dots, n \text{ and } x \in X,$ (9.3.18)

for the adjoint operators $S_i^*$, $i = 1, \dots, n$. When (9.3.18) is substituted into (b) of
(9.3.1) we conclude that
$\bigcup_{i=1}^{n} \sigma_i(X) = X.$ (9.3.19)
As a result, we may define $R \colon X \to X$ by setting

$R(\sigma_i(x)) = x \quad \text{for } i = 1, \dots, n \text{ and } x \in X,$ (9.3.20)

and conclude that $R$, defined this way, is an $n$-fold branch mapping. Substituting
back into (9.3.18), we conclude that

$S_i^* |x\rangle = \chi_{\sigma_i(X)}(x)\, |R(x)\rangle \quad \text{for } i = 1, \dots, n \text{ and } x \in X.$ □
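
On basis vectors a permutative representation is pure bookkeeping with indices, which the following Python sketch makes explicit. It assumes the concrete system $X = \mathbb{N}_0$ with branches $\sigma_i(x) = nx + i$ and $R(x) = \lfloor x/n \rfloor$, and it labels the branches $i = 0, \dots, n-1$ rather than $1, \dots, n$; the function names `S`, `S_adj`, and `check_cuntz` are hypothetical. A return value of `None` stands for the zero vector.

```python
def S(i, x, n=2):
    """S_i |x> = |sigma_i(x)> with the assumed branch sigma_i(x) = n*x + i on N_0."""
    return n * x + i


def S_adj(i, x, n=2):
    """S_i^* |x> = chi_{sigma_i(X)}(x) |R(x)>, where R(x) = x // n; None means 0."""
    return x // n if x % n == i else None


def check_cuntz(n, limit):
    """Check both relations in (9.3.1) on the basis vectors |0>, ..., |limit - 1>."""
    for x in range(limit):
        # (a): S_i^* S_j |x> equals |x> if i == j and vanishes otherwise.
        for i in range(n):
            for j in range(n):
                expected = x if i == j else None
                if S_adj(i, S(j, x, n), n) != expected:
                    return False
        # (b): exactly one term S_i S_i^* |x> is non-zero, and it returns |x>.
        images = [S(i, S_adj(i, x, n), n) for i in range(n)
                  if S_adj(i, x, n) is not None]
        if images != [x]:
            return False
    return True
```

Both `check_cuntz(2, 50)` and `check_cuntz(3, 50)` succeed, reflecting the fact that the ranges $\sigma_i(X)$ are disjoint and cover $X$, as in (9.3.17) and (9.3.19).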


9.4 Tilings 


Let $X$ be a set, and let $n \in \mathbb{N}$, $n \ge 2$, be given. Let $R \colon X \to X$ be an $n$-fold branch
mapping. For $x \in X$ and $p \in \mathbb{N}$, set

$R^{-p}(x) := \{ y \in X \mid R^p(y) = x \}.$ (9.4.1)

It is convenient to include the case $p = 0$, and set

$E(x, p) = \begin{cases} R^{-p}(x) & \text{if } p \in \mathbb{N}, \\ \{x\} \text{ (the singleton)} & \text{if } p = 0. \end{cases}$ (9.4.2)

Definition 9.4.1. A subset $A \subset X \times \mathbb{N}_0$ is said to define a tiling of $X$ if
$\bigcup_{(x,p) \in A} E(x, p) = X$ (9.4.3)
and
$E(x, p) \cap E(x', p') = \varnothing \quad \text{if } (x, p) \neq (x', p') \text{ in } A,$ (9.4.4)

i.e., the sets $E(x, p)$, indexed by distinct points in $A$, are disjoint.

Examples 9.4.2. In the next three examples we illustrate the use of tensor products
as outlined in Lemma 9.3.2. The scaling will be represented by the unitary operator 
U. One representation (S;) of O, serves to encode subdivision bands, and a second 
representation is then used in recovering U from (S;). This leads to a representation 
(9.3.2) for the operator U. The first representation (S;) of O, will be a “permutative 
representation,” i.e., it will be defined from a permutation of a canonical basis in one 
of the two tensor factors. Formula (9.2.10) is an example of such a representation, 
but there are many more; see for example the memoir [BrJo99a]. The second rep- 
resentation (V;) will be given by a quadrature-mirror filter as in (9.2.12), or more 
generally filters corresponding to 7 subbands as given in Chapter 7. The reader is 
encouraged to work out the two types of representations in Exercises 8.3 and 8.4. 

The additional feature in this construction is tensor products of Hilbert spaces. 
For this part the use of operator theory helps clarify the use of tensor products and 
setting up a “wavelet transform.” We are further relying on Dirac’s elegant bra-ket 
notation: see the conclusion of Chapter 2 for details. 


Example 9.4.2.1. Let $X = \mathbb{N}_0$, and set

\[ R(2n) = n \quad\text{and}\quad R(2n+1) = n \quad\text{for } n \in \mathbb{N}_0. \tag{9.4.5} \]

Then for $(n,p) \in X \times \mathbb{N}_0$, we have the identity

\[ E(n,p) = [\,n2^{p}, (n+1)2^{p}). \tag{9.4.6} \]

One easily checks that the mapping $R$ in (9.4.5) is a 2-fold branch mapping with
branches

\[ \sigma_0(n) = 2n, \quad \sigma_1(n) = 2n+1 \quad\text{for } n \in \mathbb{N}_0. \tag{9.4.7} \]

If $I \in \Omega(p)$, i.e., $I = (i_1,\dots,i_p)$, then

\[ \sigma_I(n) = i_1 + i_2 2 + \cdots + i_p 2^{p-1} + n2^{p}, \tag{9.4.8} \]

and it follows that the sets $E(n,p)$ are as described in (9.4.6). In conclusion, the
possible tilings of $X = \mathbb{N}_0$ associated with $R$ in (9.4.5) are the partitions of $\mathbb{N}_0$ into
non-overlapping segments of the form (9.4.6). Here are three distinct types of such
tilings.

Case (a): $A = \{(n,0) \mid n \in \mathbb{N}_0\}$, and
$E(n,0) = \{n\}$ (the singleton) for $n \in \mathbb{N}_0$.
Case (b): $A = \{(0,0), (1,p) \mid p \in \mathbb{N}_0\}$, and
$E(1,p) = [\,2^{p}, 2^{p+1})$ for $p \in \mathbb{N}_0$.
Case (c): $A = \{(0,2), (1,2k), (2,2k), (3,2k) \mid k \in \mathbb{N}\}$, where now
$E(0,2) = \{0,1,2,3\}$, and
$E(j,2k) = [\,j2^{2k}, (j+1)2^{2k})$ for $j = 1,2,3$ and $k \in \mathbb{N}$.


We stress, however, that there are many more types of examples. 
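The tiles of Example 9.4.2.1 can be checked by machine on a finite window. The sketch below is only an illustration: the helper names `E`, `is_tiling` and the window size are our own choices, and an infinite index set $A$ is necessarily truncated to the part that meets the window $\{0,\dots,1023\}$.

```python
def R(m):
    """Branch mapping (9.4.5): R(2n) = R(2n + 1) = n."""
    return m // 2


def E(x, p, universe):
    """Tile E(x, p) = {y : R^p(y) = x}, searched inside a finite window."""
    tile = set()
    for y in universe:
        z = y
        for _ in range(p):
            z = R(z)
        if z == x:
            tile.add(y)
    return tile


def is_tiling(index_set, universe):
    """Check (9.4.3) and (9.4.4), restricted to the window."""
    tiles = [E(x, p, universe) for (x, p) in index_set]
    union = set().union(*tiles)
    disjoint = sum(len(t) for t in tiles) == len(union)
    return disjoint and union == set(universe)


WINDOW = range(4 ** 5)  # the integers 0, ..., 1023

# Formula (9.4.6): E(n, p) is the integer interval [n 2^p, (n + 1) 2^p).
for n in range(4):
    for p in range(6):
        assert E(n, p, WINDOW) == set(range(n * 2 ** p, (n + 1) * 2 ** p))

# The three index sets of Cases (a)-(c), cut off at the window.
case_a = [(n, 0) for n in WINDOW]
case_b = [(0, 0)] + [(1, p) for p in range(10)]
case_c = [(0, 2)] + [(j, 2 * k) for k in range(1, 5) for j in (1, 2, 3)]

print(is_tiling(case_a, WINDOW), is_tiling(case_b, WINDOW), is_tiling(case_c, WINDOW))
```

Dropping the pair $(0,0)$ from Case (b) leaves the point $0$ uncovered, so `is_tiling` then returns `False`; this is condition (9.4.3) failing.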
Example 9.4.2.2. It would be tempting to define $R$ on $\mathbb{N}_0$ by the following rules,

\[ R(2n) = n \quad\text{and}\quad R(2n+3) = n, \tag{9.4.9} \]

by analogy to (9.4.5) in Example 9.4.2.1, but the equation $1 = 2n + 3$ does not have
solutions in $\mathbb{N}_0$. One checks that no subset $X$ of $\mathbb{N}_0$ allows the rules (9.4.9) in the
definition of a 2-fold branch mapping. Nonetheless, if we take $X = \mathbb{Z}$, then (9.4.9)
does define a 2-fold branch mapping.

So there are several differences between the two examples 9.4.2.1 and 9.4.2.2. For
Example 9.4.2.1, every point is attracted to $\{0\}$ in the sense that if $n \in \mathbb{N}_0$, then there
is a $p$ such that $R^{p}n = 0$. For Example 9.4.2.2, there is no singleton in $X = \mathbb{Z}$
which serves as an attractor. Nonetheless, for every $n \in \mathbb{Z}$, there is a $p$ such that
$R^{p}n \in \{-3,-2,-1,0\}$. The analogues of the tiling systems (a)-(c) in Example
9.4.2.1 carry over to Example 9.4.2.2 as follows.

Case (a): $A = \{(n,0) \mid n \in \mathbb{Z}\}$, and
$E(n,0) = \{n\}$ (the singleton) for $n \in \mathbb{Z}$.
Case (b): This tile system now contains a more varied system
of tiles for the set $X = \mathbb{Z}$. The singleton tiles are
$E(n,0) = \{n\}$ for $n = -3,-2,-1,0$, and in addition
there are the following five classes of
(non-overlapping) dyadic tiles:


E (-6, p) for all p € No, 
E (-4, p) for all p € No, 
E (1, p) for all p € No, 
E@, p) for all p € No, and 
E (8, p) for all p € No. 


Case (c): We leave this to the reader. 
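The two claims about the rule (9.4.9) on $\mathbb{Z}$, that it is a 2-fold branch mapping and that every orbit ends in $\{-3,-2,-1,0\}$, can be tested numerically. This is a sketch on finite ranges only, not a proof; the function `steps_to_attractor` and the ranges are our own illustrative choices.

```python
def R(m):
    """Rule (9.4.9) on the integers: R(2n) = n and R(2n + 3) = n."""
    return m // 2 if m % 2 == 0 else (m - 3) // 2


ATTRACTOR = {-3, -2, -1, 0}


def steps_to_attractor(m, limit=200):
    """Smallest p with R^p(m) in the attractor (raises if none is found)."""
    for p in range(limit + 1):
        if m in ATTRACTOR:
            return p
        m = R(m)
    raise RuntimeError("attractor not reached within the step limit")


# Every n has exactly the two preimages 2n and 2n + 3 (2-fold branching).
for n in range(-50, 50):
    preimages = sorted(k for k in range(-200, 200) if R(k) == n)
    assert preimages == sorted([2 * n, 2 * n + 3])

# The attractor is mapped into itself, and every tested integer reaches it.
assert {R(m) for m in ATTRACTOR} == ATTRACTOR
longest = max(steps_to_attractor(m) for m in range(-10000, 10001))

# On N_0 the rule breaks down at once: R(1) leaves N_0.
print(R(1), longest)
```

The value `R(1) == -1` is the obstruction named in the text: $1 = 2n + 3$ forces $n = -1$.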


In conclusion, the two examples show how libraries of wavelet bases are
constructed from two entirely different families of representations of the Cuntz
algebra $\mathcal{O}_N$.

The occurrence of one of the representations $(V_i)$ is not surprising: it is
determined by the kind of subband filters familiar from both signal processing and the
standard multiresolution approach to wavelet bases. It is the second representation
$(S_i)$ of $\mathcal{O}_N$, the permutative representation, that completes formula (9.2.10) for the
scaling operator $U$, encodes our wavelet bases and accounts for the variety of tilings
as they arise in Proposition 8.3.2 and Theorem 8.4.4. As in Proposition 8.3.2, we start
with an ONB $\{\varphi_n \mid n \in \mathbb{N}_0\}$ as in (8.3.8); we apply suitable powers of $U$ to these
functions, creating the functions $U^{p}\varphi_n$, and then ask which subsets of the double
indices yield new ONBs. The same principle applies to more general basis
constructions, and Theorem 8.4.4 spells out the tilings that yield bases of wavelet packets.

In the systematic study of permutative representations in [BrJo99a] it is shown
that each permutative representation can be understood from a single endomorphism
$R$ of the index set. In the next example we introduce $R$: see formula (9.4.10).


Example 9.4.2.3. In this next example, we use an arbitrary but fixed scaling number
$N \in \mathbb{N}$, $N \geq 2$, generalizing Example 9.4.2.1 above. We take $X = \mathbb{N}_0$, and set

\[ R(k + Nn) = n \quad\text{for } k = 0,1,\dots,N-1, \text{ and for all } n \in \mathbb{N}_0. \tag{9.4.10} \]

As in Example 9.4.2.1, one checks that this is a branch mapping, in this case an
$N$-fold branch, i.e., there are $N$ distinct branches of the inverse, $\sigma_k(n) := k + Nn$,
$k = 0,1,\dots,N-1$. For $(n,p) \in X \times \mathbb{N}_0$, $X = \mathbb{N}_0$, one checks the formula

\[ E(n,p) = [\,nN^{p}, (n+1)N^{p}). \tag{9.4.11} \]

The three types of tilings (a)-(c) from Example 9.4.2.1 then generalize as follows.

Case (a): $A = \{(n,0) \mid n \in \mathbb{N}_0\}$, and
$E(n,0) = \{n\}$ (the singletons) for $n \in \mathbb{N}_0$.
Case (b): $A = \{(0,0), (1,p), (2,p), \dots, (N-1,p) \mid p \in \mathbb{N}_0\}$, and
$E(k,p) = [\,kN^{p}, (k+1)N^{p})$, $k = 0,1,\dots,N-1$, $p \in \mathbb{N}_0$.
Case (c): $A = \{(0,2), (1,2k), (2,2k), \dots, (N^{2}-1,2k) \mid k \in \mathbb{N}\}$,
where now
$E(0,2) = \{0,1,2,3,\dots,N^{2}-1\}$, and
$E(j,2k) = [\,jN^{2k}, (j+1)N^{2k})$
for $j = 1,\dots,N^{2}-1$ and $k \in \mathbb{N}$.


In the rest of the chapter, we study some more general features of the n-fold 
branching mapping R which occurred in a special case in Example 9.4.2.3. In our 
general study of permutative representations in [BrJo99a], we refer to R as the 
canonical endomorphism. What makes the next theorem different is that it yields 
an axiomatic approach to libraries of bases in Hilbert space. It applies to the kind 
of iterated-function-system fractals studied in Chapter 8 above and in more detail 
in [BrJo99a], and at the same time it generalizes the pyramid approach to bases of 
wavelet packets; see Proposition 8.3.2, Theorem 8.4.4, and Figure 7.5 (pp. 120-121). 

An ONB $\{\varphi_n \mid n \in \mathbb{N}_0\}$ as in (8.3.8) yields one index set. The second arises
from the scale-similarity inherent in wavelet analysis, scaling by a power of 2 or $N$,
or in several variables by a power of a fixed scaling matrix. The scaling is a unitary
operator $U$ in the ambient Hilbert space, which may be $L^{2}(\mathbb{R}^{d})$, and then we create
the functions $U^{p}\varphi_n$, and ask which subsets of the double indices yield new ONBs.

The setting in the next result, Theorem 9.4.3, is maximally flexible in that it
allows a completely general ONB and index set $X$. Once this is given, we may consider
the most general permutative representations as in [BrJo99a]. Depending on the
application, e.g., an iterated-function-system fractal, a time-series problem from signal
processing, or an image-processing algorithm, we are then able to build a unitary
operator, a tensor-product representation, and associated tilings and bases of wavelet
packets, so generalized libraries of bases of wavelet packets.

A special class of iterated-function-system fractals where this works well is the
affine iterated function systems. In Section 9.6, we show how multiresolutions and 
generalized libraries of bases of wavelet packets may be adapted to these systems. 
Closing the circle from Chapter 4, we begin with the middle-third Cantor set, and 
our associated wavelet families are given in Theorem 9.6.1, so-called space-filling 
wavelet bases. In Section 9.7 we go on to study the wider class of these affine fractals. 


Theorem 9.4.3. Let $R\colon X \to X$ be an $n$-fold branch mapping, and let $(S_i)$ be
the corresponding permutative representation on $\ell^{2}(X)$. Let $\mathcal{K}$ be a Hilbert space
with orthonormal basis (ONB) $B = \{|b\rangle\}$, and let $(V_i) \in \operatorname{Rep}(\mathcal{O}_n, \mathcal{K})$. Set
$U = \sum_{i=1}^{n} S_i \otimes V_i^{*}$, and let $A \subset X \times \mathbb{N}_0$ be a subset.

Then the following two conditions are equivalent:

(i) the vectors $U^{p}\,|x \otimes b\rangle$ indexed by $(x,p) \in A$ and $b \in B$ form an ONB for
$\ell^{2}(X) \otimes \mathcal{K}$; and
(ii) $A$ defines a tiling of $X$.

Proof. Let $\mathbf{n} = \{1,\dots,n\}$, and set $\Omega(p) = \underbrace{\mathbf{n} \times \mathbf{n} \times \cdots \times \mathbf{n}}_{p \text{ times}}$. Then

\[ U^{p}\,|x \otimes b\rangle = \sum_{I \in \Omega(p)} |S_I x\rangle \otimes |V_I^{*} b\rangle, \tag{9.4.12} \]

and $S_I x \in E(x,p)$.

(ii) $\Rightarrow$ (i). Assuming (ii), it is clear that the corresponding vectors in (i) form an
orthonormal family in $\mathcal{H} := \ell^{2}(X) \otimes \mathcal{K}$. Let $y \in X$ and $b' \in B$. Pick $(x,p) \in A$
such that $y \in E(x,p)$. Then $y = S_I x$ for a unique $I = (i_1,\dots,i_p) \in \Omega(p)$, and

\[ |y\rangle \otimes |b'\rangle = U^{p}\,|x \otimes V_I b'\rangle, \tag{9.4.13} \]

where we used the identity

\[ V_J^{*} V_I = \delta_{J,I}\,\mathbb{1} \quad\text{for } J, I \in \Omega(p). \tag{9.4.14} \]

A substitution of

\[ V_I b' = \sum_{b \in B} \langle b \mid V_I b'\rangle\,|b\rangle \tag{9.4.15} \]

into (9.4.13) yields

\[ |y\rangle \otimes |b'\rangle = \sum_{b \in B} \langle b \mid V_I b'\rangle\, U^{p}\,|x \otimes b\rangle. \tag{9.4.16} \]

Since the vectors $|y\rangle \otimes |b'\rangle$ form an ONB for $\mathcal{H}$, we conclude that the family (i) is
total in $\mathcal{H}$: the vectors on the right-hand side in (9.4.16) are precisely the vectors
listed in (i).

(i) $\Rightarrow$ (ii). We now assume that the vectors in (i) form an ONB in $\mathcal{H}$.

Let $(x,p), (x',p') \in A$. Then for every $I \in \Omega(p)$ and $I' \in \Omega(p')$, $S_I x \in E(x,p)$
and $S_{I'} x' \in E(x',p')$. Since the vectors $U^{p}\,|x \otimes b\rangle$ and $U^{p'}\,|x' \otimes b'\rangle$ from
(i) are orthogonal, we conclude from (9.4.12) that (9.4.4) holds, i.e., that the distinct
sets $E(x,p)$ and $E(x',p')$ for $(x,p)$ and $(x',p')$ in $A$ are disjoint.

If (9.4.3) were not satisfied and $y \in X \setminus \bigcup_{(x,p) \in A} E(x,p)$, then it follows from
(9.4.13) that $|y, b'\rangle$ is in the orthogonal complement of the whole family (i). Hence
(i) could not be total. Since we assume that (i) is an ONB, the proof is completed. $\Box$


9.5 Applications to wavelets 


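Let $\mathcal{H}$ be the space $L^{2}(\mathbb{R})$ of all $L^{2}$-functions on $\mathbb{R}$, and let $N \in \mathbb{N}$, $N \geq 2$, be given.
Then the scaling operator

\[ (Uf)(t) = \sqrt{N}\, f(Nt), \quad f \in \mathcal{H},\ t \in \mathbb{R}, \tag{9.5.1} \]

is clearly unitary. The purpose of this section is to show that every multiresolution
wavelet decomposition of $\mathcal{H}$, corresponding to scale number $N$, gives rise to a tensor
factorization of $U$ of the form (9.3.2); specifically, $U$ has the form

\[ U = \sum_{i=0}^{N-1} S_i \otimes V_i^{*}. \tag{9.5.2} \]

Here the tensor factorization in (9.5.2) refers to

\[ \mathcal{H} \cong \ell^{2}(\mathbb{N}_0) \otimes L^{2}(\mathbb{T}) \quad\text{where } \mathbb{T} = \mathbb{R}/\mathbb{Z}, \tag{9.5.3} \]

and where

\[ (S_i) \in \operatorname{Rep}(\mathcal{O}_N, \ell^{2}(\mathbb{N}_0)) \quad\text{and}\quad (V_i) \in \operatorname{Rep}(\mathcal{O}_N, L^{2}(\mathbb{T})). \tag{9.5.4} \]

Furthermore, the two representations in (9.5.4) are specified as follows. The
representation $(S_i)$ in (9.5.4) is the permutative representation from Example 9.4.2.3
above. (If $N = 2$, it is the special case in Example 9.4.2.1.)

The representation $(V_i)$ from (9.5.4) is defined from a system of functions
$m_0, \dots, m_{N-1}$ in $L^{\infty}(\mathbb{T})$ such that the $N \times N$ matrix

\[ \begin{pmatrix}
m_0(x) & m_0\bigl(x + \tfrac{1}{N}\bigr) & \cdots & m_0\bigl(x + \tfrac{N-1}{N}\bigr) \\
m_1(x) & m_1\bigl(x + \tfrac{1}{N}\bigr) & \cdots & m_1\bigl(x + \tfrac{N-1}{N}\bigr) \\
\vdots & \vdots & & \vdots \\
m_{N-1}(x) & m_{N-1}\bigl(x + \tfrac{1}{N}\bigr) & \cdots & m_{N-1}\bigl(x + \tfrac{N-1}{N}\bigr)
\end{pmatrix} \tag{9.5.5} \]

is unitary a.e. $x \in \mathbb{T}$.
We recall the following two lemmas from [BrJo02b]; see also [Wic94].

Lemma 9.5.1. Let $m_0, m_1, \dots, m_{N-1}$ be in $L^{\infty}(\mathbb{T})$, and set

\[ (V_i f)(x) = \sqrt{N}\, m_i(x)\, f(Nx) \tag{9.5.6} \]

for $f \in L^{2}(\mathbb{T})$, $x \in \mathbb{R}$, and $i = 0,1,\dots,N-1$.
Then $(V_i) \in \operatorname{Rep}(\mathcal{O}_N, L^{2}(\mathbb{T}))$ if and only if the matrix (9.5.5) is unitary a.e. $x \in \mathbb{T}$.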


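The next lemma is also known, but we sketch it for reference.

Lemma 9.5.2. The Fourier transform

\[ \hat{f}(x) = \int_{-\infty}^{\infty} e^{-i2\pi xt} f(t)\,dt \tag{9.5.7} \]

realizes a unitary isomorphism

\[ W\colon L^{2}(\mathbb{R}) \cong \ell^{2}(\mathbb{Z}) \otimes L^{2}(\mathbb{T}) \tag{9.5.8} \]

as follows:

\[ (Wf)(x) := \bigl(\hat{f}(x+k)\bigr)_{k \in \mathbb{Z}} \in \ell^{2}(\mathbb{Z}). \tag{9.5.9} \]

Proof. Initially, $Wf(\cdot)$ is just a function from $\mathbb{R}$ mapping into $\ell^{2}(\mathbb{Z})$. But an
application of Parseval’s identity and Fubini’s theorem shows that

\[ \int_{0}^{1} \sum_{k \in \mathbb{Z}} \bigl|\hat{f}(x+k)\bigr|^{2}\,dx = \int_{-\infty}^{\infty} \bigl|\hat{f}(x)\bigr|^{2}\,dx = \int_{-\infty}^{\infty} |f(t)|^{2}\,dt. \tag{9.5.10} \]

Hence the mapping $W$ in (9.5.9) passes to a unitary isomorphism of $L^{2}(\mathbb{R})$ onto
$L^{2}(\mathbb{T}, \ell^{2}(\mathbb{Z})) \cong \ell^{2}(\mathbb{Z}) \otimes L^{2}(\mathbb{T})$. $\Box$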


Lemma 9.5.3. Let $m_0, m_1, \dots, m_{N-1}$ be in $L^{\infty}(\mathbb{T})$ and suppose the unitarity
condition (9.5.5) is satisfied. Assume in addition that $m_0$ is Lipschitz near $x = 0$, and
that $m_0(0) = \sqrt{N}$.
(a) Then there is a sequence $\varphi_0, \varphi_1, \dots$ in $L^{2}(\mathbb{R})$ such that

\[ \begin{aligned}
\hat{\varphi}_{Nn}(Nx) &= m_0(x)\,\hat{\varphi}_n(x), \\
\hat{\varphi}_{Nn+1}(Nx) &= m_1(x)\,\hat{\varphi}_n(x), \\
&\ \,\vdots \\
\hat{\varphi}_{Nn+N-1}(Nx) &= m_{N-1}(x)\,\hat{\varphi}_n(x),
\end{aligned} \tag{9.5.11} \]

and

\[ \int_{\mathbb{R}} \varphi_0(t)\,dt = 1. \tag{9.5.12} \]

(b) The following three additional conditions on the functions $\varphi_0, \varphi_1, \varphi_2, \dots$ are
mutually equivalent:


@ [ieo? dt =1, 
Gi) S’\o@+/pjP=1 aexeR, 
jeZ 
Git) SOG +A) Gn +A) =O  ae.xeR. 
jeZ 
Remark 9.5.4. The clearest way to visualize the sequence $\{\varphi_n \mid n \in \mathbb{N}_0\}$ from
Lemma 9.5.3(b), see especially (9.5.11), is to picture the functions in the case of
$N = 2$, for the Haar wavelet construction. In that case,

\[ \begin{aligned}
\varphi_{2n}(t) &= \varphi_n(2t) + \varphi_n(2t-1), \\
\varphi_{2n+1}(t) &= \varphi_n(2t) - \varphi_n(2t-1), \\
\varphi_0 &= \chi_{[0,1)}.
\end{aligned} \tag{9.5.13} \]

The first 32 functions in this sequence are reproduced in Figure 7.4 (pp. 118-119)
where some features of their construction may be observed, such as the partial
symmetry/antisymmetry of $\varphi_n$ (in the Haar case) at all dyadic scales:


on(2* —th=+,(t),  n=0,1,2,..., k=0,1,2,..., ae.te (0,2). 
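The Haar recursion (9.5.13) is easy to run. In the sketch below each $\varphi_n$ is stored by its constant values on the $2^{J}$ dyadic cells of $[0,1)$, which is exact for $n < 2^{J}$. The function name `packet` and the choice $J = 5$ (the first 32 functions) are ours.

```python
def packet(n, J):
    """Values of phi_n on the 2**J dyadic cells of [0, 1), for 0 <= n < 2**J."""
    if n == 0:
        return [1] * (2 ** J)            # phi_0 = indicator function of [0, 1)
    half = packet(n // 2, J - 1)         # phi_{n // 2}, one level coarser
    sign = 1 if n % 2 == 0 else -1       # (9.5.13): '+' for even n, '-' for odd n
    # phi_n(t) = phi_{n//2}(2t) + sign * phi_{n//2}(2t - 1)
    return half + [sign * v for v in half]


J = 5
functions = [packet(n, J) for n in range(2 ** J)]

# Gram matrix of inner products: integral of phi_n * phi_m over [0, 1),
# computed as (sum over cells) / 2**J; kept as integers by not dividing.
gram = [[sum(a * b for a, b in zip(f, g)) for g in functions] for f in functions]

orthonormal = all(
    gram[n][m] == (2 ** J if n == m else 0)
    for n in range(2 ** J)
    for m in range(2 ** J)
)
print(orthonormal)
```

In this normalization the recursion produces $\pm 1$-valued step functions, and the check confirms that the first 32 of them are orthonormal in $L^{2}(0,1)$.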


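Theorem 9.5.5. Let $m_0, m_1, \dots, m_{N-1}$ be functions in $L^{\infty}(\mathbb{T})$ which satisfy the
conditions in Lemma 9.5.3(b), and let $\varphi_0, \varphi_1, \varphi_2, \dots$ be the corresponding sequence
in $L^{2}(\mathbb{R})$. Then the double-indexed family

\[ \{\, \varphi_n(\,\cdot - k) \mid n \in \mathbb{N}_0,\ k \in \mathbb{Z} \,\} \tag{9.5.14} \]

is an orthonormal basis for $L^{2}(\mathbb{R})$. Set

\[ e_k(x) := \exp(-i2\pi kx), \tag{9.5.15} \]

and let $(V_i)_{i=0}^{N-1}$ be the representation of $\mathcal{O}_N$ defined in Lemma 9.5.1; see (9.5.6).
Then

\[ \sqrt{N}\,\varphi_n(Nt - k) = \sum_{i=0}^{N-1} \sum_{j \in \mathbb{Z}} \langle V_i e_j \mid e_k\rangle\, \varphi_{Nn+i}(t - j). \tag{9.5.16} \]

Proof. The fact that (9.5.14) is an ONB in $L^{2}(\mathbb{R})$ follows from Lemma 9.5.3(b).
When the operators $V_i$, $i = 0,1,\dots,N-1$, are defined from the filter functions $m_i$,
as in (9.5.6), it follows from Lemma 9.5.1 that $(V_i) \in \operatorname{Rep}(\mathcal{O}_N, L^{2}(\mathbb{T}))$. Hence to
prove (9.5.16), it is enough to show that

\[ \int_{\mathbb{R}} \overline{\varphi_{Nn+i}(t - j)}\,\sqrt{N}\,\varphi_n(Nt - k)\,dt = \langle V_i e_j \mid e_k\rangle \tag{9.5.17} \]

for all $n \in \mathbb{N}_0$, and all $j, k \in \mathbb{Z}$. On the right-hand side of (9.5.17), we use the inner
product in $L^{2}(\mathbb{T})$.

The proof of (9.5.17) is again based on Parseval’s identity, and a second
application of Lemma 9.5.3(b). Starting with the left-hand side in (9.5.17), we set
$n = n_1 + n_2 N + \cdots + n_p N^{p-1}$, where $n_1, n_2, \dots \in \{0,1,\dots,N-1\}$. Then

\[ \hat{\varphi}_n(x) = m_{n_1}(x/N) \cdots m_{n_p}(x/N^{p})\,\hat{\varphi}_0(x/N^{p}), \]

and

\[ \begin{aligned}
\int_{\mathbb{R}} \overline{\varphi_{Nn+i}(t - j)}\,\sqrt{N}\,\varphi_n(Nt - k)\,dt
&= \int_{\mathbb{R}} \overline{e_j(x)\,\hat{\varphi}_{Nn+i}(x)}\;\frac{1}{\sqrt{N}}\,e_k\!\left(\frac{x}{N}\right)\hat{\varphi}_n\!\left(\frac{x}{N}\right)dx \\
&= \sqrt{N} \int_{\mathbb{R}} \overline{e_j(Nx)\,\hat{\varphi}_{Nn+i}(Nx)}\;\hat{\varphi}_n(x)\,e_k(x)\,dx \\
&= \sqrt{N} \int_{\mathbb{R}} \overline{e_j(Nx)\,m_i(x)}\;\bigl|\hat{\varphi}_n(x)\bigr|^{2}\,e_k(x)\,dx \\
&= \sqrt{N} \int_{0}^{1} \overline{e_j(Nx)\,m_i(x)}\,\sum_{l \in \mathbb{Z}} \bigl|\hat{\varphi}_n(x + l)\bigr|^{2}\,e_k(x)\,dx \\
&= \sqrt{N} \int_{0}^{1} \overline{e_j(Nx)\,m_i(x)}\;e_k(x)\,dx \\
&= \int_{0}^{1} \overline{e_j(x)}\;\frac{1}{\sqrt{N}} \sum_{l=0}^{N-1} \overline{m_i\!\left(\frac{x+l}{N}\right)}\,e_k\!\left(\frac{x+l}{N}\right)dx \\
&= \int_{0}^{1} \overline{e_j(x)}\,(V_i^{*} e_k)(x)\,dx \\
&= \langle e_j \mid V_i^{*} e_k\rangle = \langle V_i e_j \mid e_k\rangle. \qquad\Box
\end{aligned} \]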


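Since every system $m_0, \dots, m_{N-1}$ which satisfies the conditions in Lemma
9.5.3(b) generates functions $\varphi_0, \varphi_1, \dots$ in $L^{2}(\mathbb{R})$ such that

\[ \{\, \varphi_n(\,\cdot - k) \mid n \in \mathbb{N}_0,\ k \in \mathbb{Z} \,\} \]

is an ONB in $L^{2}(\mathbb{R})$, the assignment

\[ \varphi_n(\,\cdot - k) \longmapsto |n\rangle \otimes |k\rangle \tag{9.5.18} \]

extends to a unitary isomorphism

\[ L^{2}(\mathbb{R}) \cong \ell^{2}(\mathbb{N}_0) \otimes \ell^{2}(\mathbb{Z}) \cong \ell^{2}(\mathbb{N}_0) \otimes L^{2}(\mathbb{T}), \tag{9.5.19} \]

where we use the familiar isomorphism $\ell^{2}(\mathbb{Z}) \cong L^{2}(\mathbb{T})$ defined by the usual Fourier
basis $\{e_k \mid k \in \mathbb{Z}\}$ introduced in (9.5.15).

Corollary 9.5.6. Let the functions $m_0, m_1, \dots, m_{N-1}$ satisfy the conditions in
Lemma 9.5.3(b), and let

\[ L^{2}(\mathbb{R}) \cong \ell^{2}(\mathbb{N}_0) \otimes \ell^{2}(\mathbb{Z}) \]

be the unitary isomorphism defined by the corresponding ONB

\[ \{\, \varphi_n(\,\cdot - k) \mid n \in \mathbb{N}_0,\ k \in \mathbb{Z} \,\}; \]

see (9.5.19) and (9.5.14).
Let $(V_i) \in \operatorname{Rep}(\mathcal{O}_N, \ell^{2}(\mathbb{Z}))$ be defined from the $(m_i)$ system as in Lemma 9.5.1,
and let

\[ S_i\,|n\rangle = |Nn + i\rangle \quad\text{for } i = 0,1,\dots,N-1 \text{ and } n \in \mathbb{N}_0 \tag{9.5.20} \]

be the permutative representation of Example 9.4.2.3.
Then the unitary scaling operator

\[ Uf(t) := \sqrt{N}\, f(Nt) \quad\text{for } f \in L^{2}(\mathbb{R}),\ t \in \mathbb{R} \tag{9.5.21} \]

has the tensor factorization

\[ U = \sum_{i=0}^{N-1} S_i \otimes V_i^{*}. \tag{9.5.22} \]

Proof. Once we have identified the isomorphism (9.5.18), the result follows from
(9.5.16) in Theorem 9.5.5. Specifically,

\[ \begin{aligned}
U\bigl(|n\rangle \otimes |k\rangle\bigr) &= \sqrt{N}\,\varphi_n(Nt - k) \\
&= \sum_{i=0}^{N-1} \sum_{j \in \mathbb{Z}} \langle V_i e_j \mid e_k\rangle\, \varphi_{Nn+i}(\,\cdot - j) \\
&= \sum_{i=0}^{N-1} \sum_{j \in \mathbb{Z}} \langle e_j \mid V_i^{*} e_k\rangle\, |S_i n\rangle \otimes |e_j\rangle \\
&= \sum_{i=0}^{N-1} |S_i n\rangle \otimes |V_i^{*} e_k\rangle \\
&= \sum_{i=0}^{N-1} S_i \otimes V_i^{*}\; |n\rangle \otimes |k\rangle. \qquad\Box
\end{aligned} \]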
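For the Haar case $N = 2$ the expansion (9.5.16) can be checked by hand and by machine. The sketch assumes the Haar filters $m_0 = (1 + e_1)/2$ and $m_1 = (1 - e_1)/2$, which are not written out in the text. With (9.5.6) they give $V_i e_j = (e_{2j} + (-1)^{i} e_{2j+1})/\sqrt{2}$. For $k = 2j + r$, $r \in \{0,1\}$, the identity (9.5.16) then reads $2\varphi_n(2t - r) = \varphi_{2n}(t) + (-1)^{r}\varphi_{2n+1}(t)$ after the shift $j = 0$. The helper names are ours.

```python
def packet(n, J):
    """Values of the Haar packet phi_n on the 2**J dyadic cells of [0, 1)."""
    if n == 0:
        return [1] * (2 ** J)
    half = packet(n // 2, J - 1)
    sign = 1 if n % 2 == 0 else -1
    return half + [sign * v for v in half]


def compressed(n, J, r):
    """Cell values on [0, 1) of t -> phi_n(2t - r), for r in {0, 1}."""
    half = packet(n, J - 1)
    zeros = [0] * len(half)
    return half + zeros if r == 0 else zeros + half


J = 6
checked = 0
for n in range(2 ** (J - 1)):
    for r in (0, 1):
        # Left side of (9.5.16) times sqrt(2): 2 * phi_n(2t - r).
        lhs = [2 * v for v in compressed(n, J, r)]
        # Right side times sqrt(2): phi_{2n} + (-1)^r * phi_{2n+1}.
        even, odd = packet(2 * n, J), packet(2 * n + 1, J)
        rhs = [a + (-1) ** r * b for a, b in zip(even, odd)]
        assert lhs == rhs
        checked += 1
print(checked)
```

Both sides were multiplied by $\sqrt{2}$ so that the comparison stays in exact integer arithmetic.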


9.6 An application to fractals 


In a paper with D. Dutkay [DuJo06b] we showed that a certain class of fractals admits
a multiresolution analysis (MRA). It is the class of fractals which is generated by a
finite system of contractive affine mappings in $\mathbb{R}^{d}$, and it includes the middle-third
Cantor set $C_3$.

Rather than treating the most general case here, we will restrict attention to two
examples, $C_3$ and $C_4$. The Cantor set $C_3$ is the unique compact subset of $\mathbb{R}$ (in fact
of $[0,1]$) which satisfies

\[ 3C_3 = C_3 \cup (C_3 + 2), \tag{9.6.1} \]

and $C_4$ is the unique compact subset of $\mathbb{R}$ which solves

\[ 4C_4 = C_4 \cup (C_4 + 2). \tag{9.6.2} \]
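The fixed-point equations (9.6.1) and (9.6.2) give an exact membership test for rational points: $t$ lies in $C_N$ precisely when repeated application of $t \mapsto Nt$ or $t \mapsto Nt - 2$ can keep the orbit inside $[0, 2/(N-1)]$ forever. The following sketch (our own helper `in_cantor`, exact arithmetic via `fractions.Fraction`) relies on the fact that the orbit of a rational point is eventually periodic, so the loop terminates.

```python
from fractions import Fraction


def in_cantor(t, N=3):
    """Exact test whether the rational t lies in C_N, where N*C = C u (C + 2)."""
    t = Fraction(t)
    top = Fraction(2, N - 1)        # largest point of C_N: 1 for N = 3, 2/3 for N = 4
    seen = set()
    while t not in seen:
        if t < 0 or t > top:
            return False            # the orbit left the hull of C_N
        seen.add(t)
        s = N * t
        t = s if s <= top else s - 2    # the only branch that can stay inside
    return True                     # periodic orbit inside the hull


# Self-similarity (9.6.1): t in C_3  iff  3t in C_3 or 3t - 2 in C_3.
for a in range(-30, 140):
    t = Fraction(a, 108)
    assert in_cantor(t) == (in_cantor(3 * t) or in_cantor(3 * t - 2))

# Self-similarity (9.6.2) for the quarter Cantor set.
for a in range(-30, 140):
    t = Fraction(a, 96)
    assert in_cantor(t, 4) == (in_cantor(4 * t, 4) or in_cantor(4 * t - 2, 4))

print(in_cantor(Fraction(1, 4)), in_cantor(Fraction(1, 2)), in_cantor(Fraction(1, 2), 4))
```

For instance $1/4 = 0.020202\ldots$ in base 3 belongs to $C_3$, while $1/2$ does not; in base 4, $1/2 = 0.2$ belongs to $C_4$.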


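Both examples are generated by scaling and subdivision into similar “smaller”
replicas, and we define the Hausdorff dimension of the fractals by

\[ s = \frac{\log(\text{number of replicas})}{\log(\text{magnification factor})}. \tag{9.6.3} \]

Furthermore, we note that

\[ s(C_3) = \log_3 2 = \frac{\ln 2}{\ln 3} \tag{9.6.4} \]

and

\[ s(C_4) = \log_4 2 = \frac{\ln 2}{\ln 4} = \frac{1}{2}. \tag{9.6.5} \]

Using a standard theorem from geometric measure theory, see, e.g., [Hut81], we
observe that there are unique probability measures $\mu_3$ and $\mu_4$ with supports $C_3$ and
$C_4$ respectively such that

\[ \int f(t)\,d\mu_3(t) = \frac{1}{2}\left( \int f\!\left(\frac{t}{3}\right) d\mu_3(t) + \int f\!\left(\frac{t+2}{3}\right) d\mu_3(t) \right) \tag{9.6.6} \]

\[ \int f(t)\,d\mu_4(t) = \frac{1}{2}\left( \int f\!\left(\frac{t}{4}\right) d\mu_4(t) + \int f\!\left(\frac{t+2}{4}\right) d\mu_4(t) \right) \tag{9.6.7} \]

hold for all bounded continuous functions $f$ on $\mathbb{R}$.
Set

\[ \mathcal{R}_3 := \left\{ t \in \mathbb{R} \;\middle|\; \exists\, n \in \mathbb{N}_0,\ c_i \in \{0,1,2\} \text{ such that } t = \sum_{i=-n}^{\infty} \frac{c_i}{3^{i}} \text{ and only a finite number of } c_i\text{’s are } 1 \right\} \tag{9.6.8} \]

and

\[ \mathcal{R}_4 := \left\{ t \in \mathbb{R} \;\middle|\; \exists\, n \in \mathbb{N}_0,\ c_i \in \{0,1,2,3\} \text{ such that } t = \sum_{i=-n}^{\infty} \frac{c_i}{4^{i}} \text{ and only a finite number of } c_i\text{’s are } 1 \text{ or } 3 \right\}. \tag{9.6.9} \]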
Using the respective Hausdorff measures $(dt)^{s}$ on $\mathbb{R}$ we get the following separable
Hilbert spaces:

\[ \mathcal{H}^{(\log_3 2)} = L^{2}\bigl(\mathcal{R}_3, (dt)^{\log_3 2}\bigr) \tag{9.6.10} \]

and

\[ \mathcal{H}^{(1/2)} = L^{2}\bigl(\mathcal{R}_4, (dt)^{1/2}\bigr). \tag{9.6.11} \]

In general, we get unitary scaling operators $U$ in these fractal Hilbert spaces
given by the formula

\[ (Uf)(t) = \sqrt{\#(\text{replicas})}\; f(Nt) \quad\text{for } t \in \mathbb{R}, \tag{9.6.12} \]

where $N$ = the magnification factor. Specifically, for $\mathcal{R}_3$,

\[ (U_3 f)(t) = \sqrt{2}\, f(3t), \tag{9.6.13} \]

and for $\mathcal{R}_4$,

\[ (U_4 f)(t) = \sqrt{2}\, f(4t) \quad\text{for } t \in \mathbb{R}. \tag{9.6.14} \]

In each example, we have a system of filter functions $(m_i)$ satisfying condition
(9.5.5) before Lemma 9.5.1. (Recall that condition (9.5.5) is equivalent to the fact
that the operators $V_i$ in (9.5.6) form a representation of the Cuntz algebra $\mathcal{O}_N$.) It
follows that the two examples are associated to representations of the respective Cuntz
algebras $\mathcal{O}_3$ and $\mathcal{O}_4$ acting on $L^{2}(\mathbb{T})$.

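For $\mathcal{R}_3$, the three functions are

\[ \begin{aligned}
m_0(z) &= \frac{1}{\sqrt{2}}\,(1 + z^{2}), \\
m_1(z) &= z, \\
m_2(z) &= \frac{1}{\sqrt{2}}\,(1 - z^{2}),
\end{aligned} \tag{9.6.15} \]

where we have set $z = e^{i2\pi x}$.
For $\mathcal{R}_4$, the four functions are

\[ \begin{aligned}
m_0(z) &= \frac{1}{\sqrt{2}}\,(1 + z^{2}), \\
m_1(z) &= z, \\
m_2(z) &= z^{3}, \\
m_3(z) &= \frac{1}{\sqrt{2}}\,(1 - z^{2}).
\end{aligned} \tag{9.6.16} \]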
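A quick numerical test of the unitarity condition for the two filter systems follows. For these filters the rows of the matrix $(m_i(x + j/N))$ have squared norm $N$, so the sketch tests $N^{-1/2}$ times that matrix. This prefactor is an assumption about the normalization intended in (9.5.5), and the function names are ours.

```python
import cmath
import math


def filters_R3(z):
    """Filters (9.6.15), evaluated at z = exp(i 2 pi x)."""
    return [(1 + z * z) / math.sqrt(2), z, (1 - z * z) / math.sqrt(2)]


def filters_R4(z):
    """Filters (9.6.16), evaluated at z = exp(i 2 pi x)."""
    return [(1 + z * z) / math.sqrt(2), z, z ** 3, (1 - z * z) / math.sqrt(2)]


def filter_matrix(filters, N, x):
    """Matrix with entries N**-0.5 * m_i(x + j/N); row index i, column index j."""
    columns = [filters(cmath.exp(2j * math.pi * (x + j / N))) for j in range(N)]
    return [[columns[j][i] / math.sqrt(N) for j in range(N)] for i in range(N)]


def is_unitary(M, tol=1e-12):
    """Check M M* = I entrywise (enough for a square matrix)."""
    size = len(M)
    for a in range(size):
        for b in range(size):
            entry = sum(M[a][k] * M[b][k].conjugate() for k in range(size))
            if abs(entry - (1 if a == b else 0)) > tol:
                return False
    return True


samples = [0.0, 0.1, 0.37, 0.5, 0.83]
print(all(is_unitary(filter_matrix(filters_R3, 3, x)) for x in samples),
      all(is_unitary(filter_matrix(filters_R4, 4, x)) for x in samples))
```

Without the factor $N^{-1/2}$ the same check fails, which is one way to see which normalization a given filter table follows.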
Theorem 9.6.1. (a) There is an orthonormal system (ON) $\varphi_0, \varphi_1, \varphi_2, \dots$ in
$L^{2}\bigl(\mathcal{R}_3, (dt)^{\log_3 2}\bigr)$ which solves the system of equations

\[ \begin{aligned}
\varphi_{3n}(t) &= \varphi_n(3t) + \varphi_n(3t - 2), \\
\varphi_{3n+1}(t) &= \sqrt{2}\,\varphi_n(3t - 1), \\
\varphi_{3n+2}(t) &= \varphi_n(3t) - \varphi_n(3t - 2), \\
\varphi_0 &= \chi_{C_3}.
\end{aligned} \tag{9.6.17} \]

(b) The functions

\[ \{\, \varphi_n(\,\cdot - k) \mid n \in \mathbb{N}_0,\ k \in \mathbb{Z} \,\} \tag{9.6.18} \]

form an ONB for $L^{2}\bigl(\mathcal{R}_3, (dt)^{\log_3 2}\bigr)$.
(c) If $(S_i) \in \operatorname{Rep}(\mathcal{O}_3, \ell^{2}(\mathbb{N}_0))$ and $(V_i) \in \operatorname{Rep}(\mathcal{O}_3, L^{2}(\mathbb{T}))$ are the associated
representations, see Example 9.4.2.3, $N = 3$, and Lemma 9.5.1, then

\[ U_3 = \sum_{i=0}^{2} S_i \otimes V_i^{*}. \tag{9.6.19} \]

Theorem 9.6.2. (a) There is an orthonormal system $\varphi_0, \varphi_1, \varphi_2, \dots$ in
$L^{2}\bigl(\mathcal{R}_4, (dt)^{1/2}\bigr)$ which solves the system of equations

\[ \begin{aligned}
\varphi_{4n}(t) &= \varphi_n(4t) + \varphi_n(4t - 2), \\
\varphi_{4n+1}(t) &= \sqrt{2}\,\varphi_n(4t - 1), \\
\varphi_{4n+2}(t) &= \sqrt{2}\,\varphi_n(4t - 3), \\
\varphi_{4n+3}(t) &= \varphi_n(4t) - \varphi_n(4t - 2), \\
\varphi_0 &= \chi_{C_4}.
\end{aligned} \tag{9.6.20} \]

(b) The functions

\[ \{\, \varphi_n(\,\cdot - k) \mid n \in \mathbb{N}_0,\ k \in \mathbb{Z} \,\} \tag{9.6.21} \]

form an ONB for $L^{2}\bigl(\mathcal{R}_4, (dt)^{1/2}\bigr)$.
(c) If $(S_i) \in \operatorname{Rep}(\mathcal{O}_4, \ell^{2}(\mathbb{N}_0))$ and $(V_i) \in \operatorname{Rep}(\mathcal{O}_4, L^{2}(\mathbb{T}))$ are the associated
representations (see Example 9.4.2.3, $N = 4$, and Lemma 9.5.1), then

\[ U_4 = \sum_{i=0}^{3} S_i \otimes V_i^{*}. \tag{9.6.22} \]

Proof. We have presented the details in Sections 9.4 and 9.5 so that the proofs of
the two theorems follow as an easy application. The main point is the observation
that the two recursive systems (9.6.17) and (9.6.20) admit solutions which satisfy the
respective orthogonality conditions. But the two Hilbert spaces $L^{2}\bigl(\mathcal{R}_3, (dt)^{\log_3 2}\bigr)$
and $L^{2}\bigl(\mathcal{R}_4, (dt)^{1/2}\bigr)$ have been defined so that this becomes clear. To see that the
respective orthonormal systems (9.6.18) and (9.6.21) really are total, one may use
the MRAs which we introduced in [DuJo06b]. The multiresolutions in the respective
fractal Hilbert spaces share the same Hilbert-space geometry with $L^{2}(\mathbb{R})$ which we
presented in Section 9.5 above. In particular, our tensor factorizations (9.6.19) and
(9.6.22) for $U_3$ and $U_4$ follow from the idea in the proof of Corollary 9.5.6 above. $\Box$


9.7 Phase modulation 


The difference between the two examples of fractals in Section 9.6 above has to do
with phase modulation, i.e., with the operators of multiplication by

\[ e_\lambda(t) = e^{i2\pi\lambda t} \quad\text{for } t \in \mathbb{R}. \tag{9.7.1} \]


So far, our wavelet constructions have involved only scaling and translation. The 
phases (9.7.1) have entered only because of their use in understanding translations 
via Fourier duality. 


Lemma 9.7.1. [JoPe98] Let $C_4$ be the quarter Cantor set, and let $\mu_4$ be the
corresponding normalized measure supported on $C_4$ and satisfying (9.6.7). Set

\[ \Lambda_4 := \Bigl\{ \sum_{j \geq 0} 4^{j} n_j \;\Bigm|\; \text{finite sums},\ n_j \in \{0,1\} \Bigr\}. \tag{9.7.2} \]

Then the functions $\{e_\lambda \mid \lambda \in \Lambda_4\}$ restricted to $C_4$ form an orthonormal basis in
$L^{2}(C_4, \mu_4)$.
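The orthogonality half of Lemma 9.7.1 can be seen from the invariance equation (9.6.7). Taking $f = e_\xi$ with $e_\xi(t) = e^{i2\pi\xi t}$ (this sign convention is an assumption of the sketch) gives $\hat{\mu}_4(\xi) = \tfrac{1}{2}\bigl(1 + e^{i2\pi \cdot 2\xi/4}\bigr)\hat{\mu}_4(\xi/4)$, hence an infinite product. The inner product of $e_\lambda$ and $e_{\lambda'}$ in $L^{2}(\mu_4)$ equals $\hat{\mu}_4(\lambda' - \lambda)$. The code below evaluates a truncated product; the names `lambda4`, `mu4_hat` and the truncation depth are ours, and totality of the family is not tested.

```python
import cmath
import math
from itertools import product


def lambda4(digits):
    """All sums of 4**j * n_j with n_j in {0, 1} and j < digits, as in (9.7.2)."""
    return sorted(
        sum(n * 4 ** j for j, n in enumerate(bits))
        for bits in product((0, 1), repeat=digits)
    )


def mu4_hat(xi, terms=30):
    """Truncated product for the integral of exp(i 2 pi xi t) against mu_4."""
    value = 1.0 + 0j
    for k in range(1, terms + 1):
        value *= (1 + cmath.exp(2j * math.pi * 2 * xi / 4 ** k)) / 2
    return value


points = lambda4(4)          # 0, 1, 4, 5, 16, 17, 20, 21, 64, ...

# Distinct frequencies in Lambda_4 give (numerically) vanishing inner products.
worst = max(abs(mu4_hat(a - b)) for a in points for b in points if a != b)
print(points[:8], worst < 1e-9)
```

The reason is arithmetic: a difference of two distinct elements of $\Lambda_4$ has the form $4^{m} \cdot (\text{odd integer})$, and then the factor with $k = m + 1$ in the product is $\tfrac{1}{2}(1 + e^{i\pi \cdot \text{odd}}) = 0$.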


Remark 9.7.2. It is known [JoPe98] that there are no more than two orthogonal
functions $e_\lambda$ (for any $\lambda \in \mathbb{R}$) in the Hilbert space $L^{2}(C_3, \mu_3)$.


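Theorem 9.7.3. Let $N = 4$, and let $\{\varphi_n \mid n \in \mathbb{N}_0\}$ be the orthonormal sequence in
$L^{2}\bigl(\mathcal{R}_4, (dt)^{1/2}\bigr)$ constructed in Theorem 9.6.2.

Then the family

\[ \{\, e_\lambda(t)\,\varphi_j(t - k) \mid \lambda \in \Lambda_4,\ j = 0,3,\ k \in \mathbb{Z} \,\} \tag{9.7.3} \]

and

\[ \{\, e_\lambda(t/4)\,\varphi_j(t - k) \mid \lambda \in \Lambda_4,\ j = 1,2,\ k \in \mathbb{Z} \,\} \tag{9.7.4} \]

is an orthonormal basis (ONB) in the Hilbert space $L^{2}\bigl(\mathcal{R}_4, (dt)^{1/2}\bigr)$.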


Proof. The proof follows from a combination of (i) Lemma 9.7.1, (ii) the discussion
of the affine fractals in Section 9.6, and (iii) some basic observations of scaling and
the Hausdorff measure; see also [DuJo06b] and the following list.

Several observations are needed:

(1) $\Lambda_4 \subset \mathcal{R}_4$.
(2) Setting $\tau_j(t) = (t + j)/4$, $j = 0,1,2,3$, we note that $C_4 = \tau_0 C_4 \cup \tau_2 C_4$, as a
non-overlapping union, and

\[ \begin{aligned}
\varphi_0 &= \chi_{C_4} = \chi_{\tau_0 C_4} + \chi_{\tau_2 C_4}, \\
\varphi_1 &= \sqrt{2}\,\chi_{\tau_1 C_4}, \\
\varphi_2 &= \sqrt{2}\,\chi_{\tau_3 C_4}, \\
\varphi_3 &= \chi_{\tau_0 C_4} - \chi_{\tau_2 C_4}.
\end{aligned} \tag{9.7.5} \]

(3) The Hausdorff measure $h^{1/2}$ restricted to $C_4$ is $\mu_4$.
(4) The integration $(dt)^{1/2}$ in the definition of the Hilbert space refers to $h^{1/2}$.
(5) Borel subsets of $\mathbb{R}$ are $h^{1/2}$-measurable.
(6)

\[ h^{1/2}(B + t) = h^{1/2}(B) \tag{9.7.6} \]

and

\[ h^{1/2}(cB) = \sqrt{c}\,h^{1/2}(B) \tag{9.7.7} \]

for all $h^{1/2}$-measurable sets $B$, all $t \in \mathbb{R}$, and all $c \in \mathbb{R}_{+}$.
(7)

\[ \int f(t)\,dh^{1/2}(t) = \frac{1}{2} \int f(\tau_j(t))\,dh^{1/2}(t) \tag{9.7.8} \]

holds for all $h^{1/2}$-measurable functions on $\mathcal{R}_4$.

With this, we leave the remaining detailed verifications to the reader. $\Box$


Exercises 


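9.1. Let $\varphi \in L^{2}(\mathbb{R})$ and suppose there is some sequence $(a_k)_{k \in \mathbb{Z}}$ such that

\[ \varphi(t) = 2 \sum_{k \in \mathbb{Z}} a_k\,\varphi(2t - k), \quad t \in \mathbb{R}. \tag{9E.1} \]

(a) Show that if $\int_{\mathbb{R}} \varphi(t)\,dt \neq 0$, then

\[ \sum_{k \in \mathbb{Z}} a_k = 1. \]

(b) Suppose there are constants $c_1$, $c_2$ such that $0 < c_1 \leq c_2 < \infty$, and

\[ c_1 \sum_{k \in \mathbb{Z}} |\xi_k|^{2} \leq \Bigl\| \sum_{k \in \mathbb{Z}} \xi_k\,\varphi(\,\cdot - k) \Bigr\|_{L^{2}(\mathbb{R})}^{2} \leq c_2 \sum_{k \in \mathbb{Z}} |\xi_k|^{2}. \]

Let $V_\varphi$ (=: $V_0$) denote the closed linear span of $\{\varphi(\,\cdot - k) \mid k \in \mathbb{Z}\}$.
Then show that $W = W_\varphi$ given by

\[ W\Bigl( \sum_{k \in \mathbb{Z}} \xi_k\,\varphi(\,\cdot - k) \Bigr) = (\xi_k)_{k \in \mathbb{Z}} \tag{9E.2} \]

defines a bounded linear operator from $V_0$ into $\ell^{2}(\mathbb{Z})$.
(c) Show that the operator $W$ in (b) has zero kernel, i.e., that

\[ \{\, f \in V_0 \mid Wf = 0 \,\} = \{0\}. \]

(d) Define an operator $S_0$ on $\ell^{2}(\mathbb{Z})$ by

\[ (S_0 \xi)_n = \sqrt{2} \sum_{k \in \mathbb{Z}} a_{n-2k}\,\xi_k, \]

and let $U_2$ be the dyadic scaling operator in $L^{2}(\mathbb{R})$, i.e.,

\[ (U_2 f)(t) = \frac{1}{\sqrt{2}}\, f\!\left(\frac{t}{2}\right). \]

Then show that the following intertwining relation holds:

\[ W U_2 = S_0 W \quad\text{on } V_\varphi. \tag{9E.3} \]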


9.2. We continue with the assumptions introduced in Exercise 9.1 above.
(a) Using Exercise 5.7, find a Ruelle operator $R$ which has the function

\[ h(x) := \operatorname{Av}\bigl(|\hat{\varphi}|^{2}\bigr)(x) := \sum_{n \in \mathbb{Z}} |\hat{\varphi}(x + n)|^{2} \]

as eigenfunction, i.e., with $Rh = h$.

(b) Using the function $\sum_{n \in \mathbb{Z}} |\hat{\varphi}(x + n)|^{2}$ from (a), find an explicit formula for
the adjoint operator $W_\varphi^{*}\colon \ell^{2}(\mathbb{Z}) \to V_\varphi$, when $W_\varphi$ is the intertwining operator
introduced in Exercise 9.1 (see (9E.2)).

9.3. Let the operators $W$, $U_2$, and $S_0$ be as described in Exercise 9.1.

(a) Show that there is a unique isometry $T\colon V_\varphi \to \ell^{2}(\mathbb{Z})$ such that

\[ W = (WW^{*})^{1/2}\,T. \]

(b) Show that the operator $T$ in part (a) satisfies

\[ W = T\,(W^{*}W)^{1/2}. \]

(c) Show that the operator $T$ in part (a) satisfies

\[ W^{*}W = T^{*}\,(WW^{*})\,T. \]

(d) Show that the operator $T$ in part (a) satisfies

\[ T U_2 = S_0 T \quad\text{on } V_\varphi. \]

Hint for part (d): $T$ is the isometric part of the polar decomposition of $W$. First
use (9E.3) to show that $(W^{*}W)\,U_2 = U_2\,(W^{*}W)$. Then use the spectral theorem on
$W^{*}W$ to show that $(W^{*}W)^{1/2}\,U_2 = U_2\,(W^{*}W)^{1/2}$; and the conclusion then follows
from parts (b) and (c).

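9.4. Let $(a_k)$ and $(b_k)$ be two sequences indexed by $\mathbb{Z}$, and define operators $S_0$ and
$S_1$ in $\ell^{2}(\mathbb{Z})$ by

\[ (S_0 \xi)_n = \sqrt{2} \sum_{k \in \mathbb{Z}} a_{n-2k}\,\xi_k \]

and

\[ (S_1 \xi)_n = \sqrt{2} \sum_{k \in \mathbb{Z}} b_{n-2k}\,\xi_k. \]

(a) Show that the following (i) and (ii) are equivalent.

(i) The Cuntz relations

\[ S_i^{*} S_j = \delta_{i,j}\,\mathbb{1} \quad\text{and}\quad \sum_{i=0}^{1} S_i S_i^{*} = \mathbb{1} \]

hold.
(ii) The three separate identities

\[ 2 \sum_{k} \bar{a}_k\,a_{k+2l} = \delta_{0,l}, \qquad 2 \sum_{k} \bar{b}_k\,b_{k+2l} = \delta_{0,l}, \qquad \sum_{k} \bar{a}_k\,b_{k+2l} = 0 \]

hold.
(b) Suppose $(a_k)_{k \in \mathbb{Z}}$ satisfies

\[ 2 \sum_{k} \bar{a}_k\,a_{k+2l} = \delta_{0,l}. \]

Setting $b_k = (-1)^{k}\,\bar{a}_{1-k}$, show that the two sequences then satisfy the conditions
in (ii).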
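(c) Given two sequences $(a_k)$ and $(b_k)$, define

\[ U(z) := \sqrt{2} \begin{pmatrix} \sum_{k} a_{2k} z^{k} & \sum_{k} a_{2k+1} z^{k} \\ \sum_{k} b_{2k} z^{k} & \sum_{k} b_{2k+1} z^{k} \end{pmatrix}. \]

Now show that $U(z)$ is unitary for all $z \in \mathbb{T}$ if and only if the conditions in (ii) are
satisfied.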
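As a numerical companion to Exercise 9.4(b), the three identities in (ii) can be tested on the Daubechies four-tap coefficients, normalized here so that $\sum_k a_k = 1$ as in Exercise 9.1(a). The coefficient table and the helper `corr` are our additions; the exercise itself asks for a proof valid for all admissible sequences.

```python
import math

s3 = math.sqrt(3)

# Daubechies four-tap low-pass coefficients, scaled so that they sum to 1.
a = {0: (1 + s3) / 8, 1: (3 + s3) / 8, 2: (3 - s3) / 8, 3: (1 - s3) / 8}

# b_k = (-1)^k * conj(a_{1-k}); the coefficients are real, so conj is omitted.
b = {k: (-1) ** k * a[1 - k] for k in range(-2, 2)}


def corr(u, v, l):
    """sum_k conj(u_k) * v_{k + 2l} for real, finitely supported sequences."""
    return sum(u[k] * v.get(k + 2 * l, 0.0) for k in u)


def delta(l):
    return 1.0 if l == 0 else 0.0


ok = all(
    abs(2 * corr(a, a, l) - delta(l)) < 1e-12
    and abs(2 * corr(b, b, l) - delta(l)) < 1e-12
    and abs(corr(a, b, l)) < 1e-12
    for l in range(-3, 4)
)
print(ok, sum(a.values()))
```

The cross term vanishes for a structural reason: the substitution $k \mapsto 1 - 2l - k$ changes the sign of $\sum_k (-1)^{k} a_k a_{1-2l-k}$, so the sum equals its own negative.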
9.5. Consider a dyadic wavelet basis in $L^{2}(\mathbb{R})$ indexed as follows:

\[ \psi_{j,k}(t) = 2^{-j/2}\,\psi(2^{-j}t - k), \quad j, k \in \mathbb{Z}, \]

and let $W\colon V_0 \to \ell^{2}(\mathbb{Z})$ be the intertwining operator from Exercise 9.1. Finally, let
$(e_k)_{k \in \mathbb{Z}}$ be the canonical basis in $\ell^{2}$, i.e., $e_k(j) := \delta_{k,j}$. Prove that

\[ W(\psi_{j,k}) = S_0^{\,j-1} S_1 e_k \quad\text{for all } j = 1,2,\dots, \text{ and } k \in \mathbb{Z}. \]

9.6. The discrete wavelet transform

Let $(a_k)_{k \in \mathbb{Z}}$ and $(b_k)_{k \in \mathbb{Z}}$ be sequences satisfying the orthogonality conditions in
Exercise 9.4(a). Let $F = S_0^{*}$ and $G = S_1^{*}$ be the corresponding slanted matrices in
(7E.2), $F$ defined from $(a_k)$, and $G$ from $(b_k)$. For $x \in \ell^{2}(\mathbb{Z})$, we say that

\[ x = \sum_{j=0}^{\infty} \sum_{k \in \mathbb{Z}} c_{j,k}\,S_0^{\,j} S_1 e_k \tag{9E.4} \]

is the discrete wavelet transform.

(a) Show that the wavelet coefficients $c_{j,k}$ in (9E.4) are given by the following
matrix multiplication:

\[ c_{j,k} = \bigl(G F^{j} x\bigr)_k = \sum_{l} \bigl(G F^{j}\bigr)_{k,l}\,x_l. \tag{9E.5} \]

(b) Carry out the matrix products in (9E.5) in the four-tap case, i.e., when the two
sequences have only four non-zero terms, say $a_0, a_1, a_2, a_3$ and $b_0, b_1, b_2, b_3$.

(c) In the four-tap case (part (b)) show that each of the finite-dimensional
subspaces $\mathcal{S}_k := \operatorname{span}\{e_{-k},\dots,e_{-1},e_0,e_1,\dots,e_k\}$, $k \geq 2$, is invariant under matrix
multiplication with $F$ and with $G$.

(d) With assumptions as in part (c), show that multiplication with $F$ and $G$ maps
$\mathcal{S}_{k+1}$ into $\mathcal{S}_k$ for $k \geq 2$.

(e) In the four-tap case, and representing vectors in $\mathcal{S}_2$ as

\[ x = \begin{pmatrix} x_{-2} \\ x_{-1} \\ x_0 \\ x_1 \\ x_2 \end{pmatrix}, \]

show that $y = Fx$ has the following form, matrix multiplication in the four-tap case,
i.e., matrix times column vector:

\[ \begin{aligned}
y_{-2} &= a_2 x_{-2} + a_3 x_{-1}, \\
y_{-1} &= a_0 x_{-2} + a_1 x_{-1} + a_2 x_0 + a_3 x_1, \\
y_0 &= a_0 x_0 + a_1 x_1 + a_2 x_2, \\
y_1 &= a_0 x_2, \\
y_2 &= 0.
\end{aligned} \]
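The five displayed formulas in part (e) are the rows of a slanted matrix with entries $F_{n,k} = a_{k-2n}$. The sketch below takes that index convention from the displayed formulas at face value; the factor $\sqrt{2}$ and the complex conjugation in $S_0^{*}$ are not included, and the sample numbers are arbitrary.

```python
def apply_F(a, x, K):
    """y_n = sum_k a_{k - 2n} x_k for n, k in -K..K, with a = [a_0, a_1, a_2, a_3]."""

    def tap(i):
        # Four-tap filter: coefficients outside 0..3 vanish.
        return a[i] if 0 <= i < len(a) else 0

    return {
        n: sum(tap(k - 2 * n) * x[k] for k in range(-K, K + 1))
        for n in range(-K, K + 1)
    }


a = [1, 2, 3, 4]                              # stand-ins for a_0, a_1, a_2, a_3
x = {-2: 5, -1: 7, 0: 11, 1: 13, 2: 17}       # a vector in the span of e_{-2}, ..., e_2

y = apply_F(a, x, K=2)

# The explicit rows from part (e).
expected = {
    -2: a[2] * x[-2] + a[3] * x[-1],
    -1: a[0] * x[-2] + a[1] * x[-1] + a[2] * x[0] + a[3] * x[1],
    0: a[0] * x[0] + a[1] * x[1] + a[2] * x[2],
    1: a[0] * x[2],
    2: 0,
}
print(y == expected, y)
```

Each output row uses taps shifted by two columns relative to the row above, which is the "slanted" band structure of the matrix.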
9.7. Let $\mathcal{H}$ be a complex Hilbert space, and let $S_0$, $S_1$ be bounded operators in $\mathcal{H}$.
For $j \in \{0,1,2,\dots\}$, set $T_j := S_0^{\,j} S_1$. Now show that the following two systems of
conditions (i) and (ii) are equivalent.

(i) $S_0$, $S_1$ satisfy the $\mathcal{O}_2$-Cuntz relations, and $S_0^{*n} \to 0$ as $n \to \infty$.

(ii) $\{T_j \mid j \in \mathbb{N}_0\}$ satisfy the $\mathcal{O}_\infty$-Cuntz relations, i.e.,

\[ T_j^{*} T_k = \delta_{j,k}\,\mathbb{1} \quad\text{for all } j, k \in \mathbb{N}_0, \]

\[ \sum_{j \in \mathbb{N}_0} T_j T_j^{*} = \mathbb{1}. \]

PS.: The assertions in (i) and (ii) about convergence refer to the strong
operator topology (SOT) on $B(\mathcal{H})$ = the bounded operators on $\mathcal{H}$. Recall that the
neighborhoods of 0 in $B(\mathcal{H})$ are generated by finite sets $\{h_i\}$ of vectors in $\mathcal{H}$ and
$\varepsilon \in \mathbb{R}_{+}$: $\{\, T \in B(\mathcal{H}) \mid \|T h_i\| < \varepsilon \,\}$. Specifically, $S_0^{*n} \to 0$ (SOT) if and only if

\[ \lim_{n \to \infty} \bigl\| S_0^{*n} h \bigr\| = 0 \quad\text{for all } h \in \mathcal{H}. \]
9.8. Kolmogorov 


(a) Let $n \in \mathbb{N}$, and let $S_n$ be some set of cardinality $n$. Let $(R_n(x,y))$ be a fixed
positive definite $n \times n$ matrix, i.e., satisfying

\[ \sum_{x \in S_n} \sum_{y \in S_n} \bar{\xi}_x\,R_n(x,y)\,\xi_y \geq 0 \]

for all $(\xi_x)_{x \in S_n}$ in $\mathbb{C}^{n}$. Show that there is an $n$-dimensional Gaussian probability
distribution $\nu^{(x_1,\dots,x_n)}$ which has $R_n$ as its covariance matrix, and with mean vector
zero.

(b) Let $S$ be an infinite set, and let $R\colon S \times S \to \mathbb{C}$ be a positive definite function,
i.e., satisfying

\[ \sum_{j} \sum_{k} \bar{\xi}_j\,R(x_j, x_k)\,\xi_k \geq 0 \quad\text{(finite sum)} \]

for all $\xi_1, \xi_2, \dots \in \mathbb{C}$ and all $x_1, x_2, \dots \in S$.
Prove that there is a probability space $(\Omega, \mathcal{B}, \nu)$ and a function $X\colon S \to L^{2}(\Omega, \nu)$
such that

\[ R(x,y) = E_\nu\bigl(\overline{X(x)}\,X(y)\bigr) \quad\text{for all } x, y \in S, \tag{9E.6} \]

and $E_\nu(X(x)) = 0$.

Hint: If $n \in \mathbb{N}$ and $S_n = \{x_1, x_2, \dots, x_n\}$ is a finite subset of $S$ with $\#(S_n) = n$,
then use (a) to pick a Gaussian probability distribution $\nu^{(x_1,\dots,x_n)}$ of dimension $n$
which has $(R(x_j, x_k))$ as covariance matrix, and with mean vector zero.

(c) If $S' \subset S''$ are two finite subsets of $S$, and $\nu^{S'}$, $\nu^{S''}$ are the corresponding
finite-dimensional Gaussian probability distributions from (b), then show that they are
consistent in the sense of Kolmogorov; see Lemma 2.5.1.




(d) Set $\Omega := \mathbb{C}^{S}$ = all functions from $S$ into $\mathbb{C}$. For $x \in S$, set $\pi_x(\omega) := \omega(x)$,
$\omega \in \Omega$. Show that there is a smallest $\sigma$-algebra $\mathcal{B}$ on $\Omega$ for which all the functions
$\{\pi_x \mid x \in S\}$ are measurable.

(e) Use Kolmogorov’s theorem to conclude that there is a unique probability
measure $\nu$ on $(\Omega, \mathcal{B})$ such that for all finite subsets $S'$, the marginal distribution of $\nu$
is the measure $\nu^{S'}$ from (c).

(f) Set $X(x, \omega) := \pi_x(\omega)$ for $x \in S$, and $\omega \in \Omega$. Then show that $X(x, \cdot) \in L^{2}(\Omega, \nu)$
and that this random process satisfies the conditions (9E.6) in (b).

(g) State a special case of the conclusion in (b), $S = [0,1]$, which is the converse
implication to that of the Karhunen-Loève theorem of Exercise 2.7 (pp. 55-56).


9.9. The following is a corollary to the conclusion in Exercise 9.8. Give a direct and 
geometric proof independently of Kolmogorov’s measure-theoretic construction in 
Exercise 9.8(b). 

Let $S$ be a set, and let $R\colon S \times S \to \mathbb{C}$ be a positive definite function. Show that
there is a Hilbert space $(\mathcal{H}, \langle\,\cdot \mid \cdot\,\rangle)$ and a function $X\colon S \to \mathcal{H}$ such that

\[ R(x,y) = \langle X(x) \mid X(y)\rangle \quad\text{for all } x, y \in S. \tag{9E.7} \]

Hint: Set $X(x) = R(x, \cdot)$, viewing $R(x, \cdot)$ as a function on $S$ for $x \in S$. Then
use linear span, quotient, norm-completion, and (9E.7) to finish the argument.


References and remarks 


In an earlier textbook [BrJo02b], we consistently lived with the notational
convention $m_0(0) = \sqrt{N}$ for the frequency response function $m_0$; but I noticed when giving
lectures about this material to a mixed audience of students in mathematics and in
engineering that the alternative convention $m_0(0) = 1$ seems to be less confusing. The
idea is that the entire signal is let through by the low-pass filter at frequencies
near zero, i.e., low frequencies. And the other frequency components of the signal
are blocked by the $m_0$ filter. So in this book we have broken with the convention
from [BrJo02b].

Add to that the confusion about a different notational dichotomy, multiplicative
vs. additive notation for the torus groups: Frequency 0 in additive notation
corresponds to the point $z = 1$ on $\mathbb{T}$ in multiplicative notation. This dichotomy is dictated
by the choice of the alternative meanings of the manifolds $\mathbb{T}^{d}$, $d = 1,2,\dots$.
Functions on $\mathbb{T}^{d}$ may be viewed alternately as periodic functions on $\mathbb{R}^{d}$, or equivalently
as functions on the compact torus $\mathbb{T}^{d}$.

Although the material in the last exercise, Exercise 9.8 above, serves to walk the 
student through some main ideas of Kolmogorov, this material is also available in 
more detail in a number of books on probability theory. Here we wish to especially 
recommend the presentation [PaSc72] by Parthasarathy and Schmidt. But see also 
Nelson’s paper [Nel59]. 


Appendices: Polyphase matrices and the
operator algebra $\mathcal{O}_N$


Born wanted a theory which would generalize these matrices or grids of 
numbers into something with a continuity comparable to that of the continu- 
ous part of the spectrum. The job was a highly technical one, and he counted 
on me for aid. ... I had the generalization of matrices already at hand in the 
form of what is known as operators. Born had a good many qualms about 
the soundness of my method and kept wondering if Hilbert would approve of 
my mathematics. Hilbert did, in fact, approve of it, and operators have since 
remained an essential part of quantum theory. —Norbert Wiener 


PREREQUISITES: Curiosity about functions mapping into the group of unitary ma- 
trices. 


Prelude 


The dialog between engineers and mathematicians is hampered by a substantial dif- 
ference in jargon in the two fields. Often the same idea makes its appearance in the 
two fields but under a different name. A case in point is the representation of a class 
of algebras which in the operator-algebraic community is known as the Cuntz al- 
gebras, see [Cun77], but which in signal processing is known as a dual system of 
subband filters yielding a perfect reconstruction in the signal output. The breaking 
up of a signal (say a speech signal) into a finite number N of filtered and down- 
sampled subbands is called analysis. There is then a dual operation of up-sampling 
and filtering. When the result is then merged (added), we call it synthesis. When 
N = 2, the use of filters (high-pass/low-pass) with a perfect reconstruction is called 
quadrature-mirror (referring to the duality) filters. Even when N is larger than 2, we 
may on occasion abuse notation and still talk about quadrature-mirror filters.

So even within mathematics, there are separate themes; and analogously for engineering!
And it does not become any less confusing by the huge variety of terminology
and occasionally inconsistent jargon. In order to help organize the presentation
and the mathematics around engineering themes, we have subdivided the discussion
into four separate appendices.


Appendix A: Signals and filters 


Our use in this book of signal-processing diagrams such as Figures 7.7 and 7.14 
(pp. 124 and 132) summarizes this combined process (analysis and synthesis) very 
nicely. While the diagrammatic viewpoint in fact originated in the early days of sig- 
nal processing, the diagrams are much more versatile, and they are now used widely 
in modern applications such as image processing as well. In our present discussion, 
for simplicity, we have stressed signal processing. So in the simplest case, a signal
(also called a time series) is represented by a sequence $x = (x_n)$ of numbers,
with the index denoting time. For the purpose of devising algorithms, it is natural
to introduce generating functions of a complex variable $z$, i.e., $X(z) := \sum_n x_n z^n$.
We have illustrated some wavelet algorithms based on this approach in Figures
7.8–7.13 (pp. 125–128). One of the advantages of this approach is that a variety
of analytic steps which on the face of it appear not to be especially computational
get rewritten in a finite sequence of relatively elementary matrix steps. As emphasized
in Figures 7.13 and 7.14, when the operations of down- and up-sampling
are introduced into filtering models, then the matrices become slanted. Indeed, the
slanted nature of the matrices in turn is responsible for a speed-up of the computations,
as reflected in the well-known sparseness of the wavelet matrices; see, e.g.,
[BrRo91, Coh03, Dau92, PaSW99, StNg96]. As a result, the complexity of the steps
in a computation is reduced from $n^2$ to $n \log n$.

In signal processing, it is convenient to restrict the complex variable $z$ to the circle
$\mathbb{T}$, i.e., setting $z = \exp(i\omega)$, so that $z^n = \exp(in\omega)$, and we think of $\omega$ as frequency,
or more generally as the dual variable. Of course, this means that $X(z)$ becomes a
Fourier series. Engineers refer to $X(z)$ as the frequency response. If numbers $(a_n)$ are
given, the Cauchy product $(y_n)$ of the two sequences $(a_n)$ and $(x_n)$ is called a filter, or
a filtering of the signal $x$, and the function $m(z) := \sum_n a_n z^n$ is called the
frequency-response filter. We will often refer to $m(z)$ itself as the filter, or more precisely the
filter function. The reader will easily verify that the associated frequency response
$Y(z)$ of the filtered signal is then the pointwise product $Y(z) = m(z)X(z)$. But we
shall here use duality much more generally than for the classical duality of time
and frequency in time series analysis. The general duality we have in mind is thus a
special case of Fourier duality; see, e.g., [Mal98, Waln02].
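
The identity $Y(z) = m(z)X(z)$ can be checked numerically on a small case. The following is a minimal sketch, not taken from the book: the taps `a`, the signal `x`, and the frequency `omega` are illustrative assumptions, and both sequences are taken to be finitely supported and indexed from 0.

```python
import cmath

def cauchy_product(a, x):
    """Filter the finite signal x by the taps a: y_n = sum_k a_k x_{n-k}."""
    y = [0.0] * (len(a) + len(x) - 1)
    for k, a_k in enumerate(a):
        for j, x_j in enumerate(x):
            y[k + j] += a_k * x_j
    return y

def generating_function(seq, z):
    """Evaluate sum_n seq[n] * z**n for a finitely supported sequence."""
    return sum(c * z**n for n, c in enumerate(seq))

a = [0.5, 0.5]              # illustrative low-pass taps (assumption), m(1) = 1
x = [1.0, 3.0, -2.0, 4.0]   # illustrative signal (assumption)
y = cauchy_product(a, x)

omega = 0.7                 # an arbitrary frequency
z = cmath.exp(1j * omega)   # the corresponding point on the circle T

# Frequency response of the filtered signal versus the pointwise product.
lhs = generating_function(y, z)
rhs = generating_function(a, z) * generating_function(x, z)
max_error = abs(lhs - rhs)
```

Here `max_error` should vanish up to rounding, which is the time/frequency statement that filtering corresponds to multiplication by $m$.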

A main point in this book is to organize the diverse themes (both mathematics 
and engineering) within the framework of Hilbert-space geometry. While mathe- 
maticians tend to create Hilbert spaces of functions, engineers are concerned with 
computations, and hence with transformations of sequences. The sequences could 
be indexed by the integers for time, i.e., time series; or they could be organized as 
matrices with double index (e.g., pixels) as is used for the grayscale numbers for an 
exposure in a digital camera. 

As for the mathematics, the next step is then to break the transformations in 
sequence spaces into smaller algorithmic steps. That is where matrix tricks become 
handy. One such matrix trick centers on what engineers call “the polyphase matrix;” 
see Figures C.1 and C.2 (pp. 214–215). For sequence space we suggest the Hilbert
spaces $\ell^2$ but we allow flexibility for the indexing.

As we saw in Chapter 7, the selection for a particular problem of a resolution subspace
$V_0$ (and a generating function, a scaling function $\varphi$) within an ambient Hilbert
space of functions allows us to set up an isometric isomorphism between a chosen
$V_0$ and the Hilbert space $\ell^2$ of sequences. Hence we are faced with a correspondence
principle. This has been organized into Table C.1.

In the simplest case, we may take $L^2(\mathbb{R}^d)$ to be the ambient Hilbert space, and
the subspace $V_0$ to be the closed linear span of the translates $\{\varphi(\cdot - n) \mid n \in \mathbb{Z}^d\}$,
where $\mathbb{Z}^d$ denotes the usual rank-$d$ lattice in $\mathbb{R}^d$. If there are finite positive constants
(frame constants) $A$ and $B$ such that

\[ A \le \sum_{k \in \mathbb{Z}^d} |\hat{\varphi}(t + k)|^2 \le B \quad \text{for all } t \in \mathbb{R}^d, \]

we may then define an operator (a frame operator) $W\colon V_0 \to \ell^2(\mathbb{Z}^d) =: \ell^2$ as follows:

\[ W\Bigl( \sum_{n \in \mathbb{Z}^d} x_n \varphi(\cdot - n) \Bigr) = (x_n)_{n \in \mathbb{Z}^d}. \]

The reader can check that $W$ is bounded, and invertible with bounded inverse.


Theorem A.1. If $U := U_A$ denotes a chosen scaling operator on $L^2(\mathbb{R}^d)$, for
example

\[ (Uf)(x) := |\det A|^{-1/2} f(A^{-1}x), \quad x \in \mathbb{R}^d, \]

for some expansive integral $d \times d$ matrix $A$, then it follows that the frame operator
$W\colon V_0 \to \ell^2$ intertwines $U$ with the low-pass synthesis operator $S_0$ on $\ell^2$; i.e., that

\[ S_0 W = W U. \]

Proof. The reader is encouraged to work out the details of the argument in Exercises
A.1 and A.2 below; see especially part (b) of Exercise A.2. However, note that the
exercises only sketch the outline of proof in the special case of a dyadic scaling operator.
Nonetheless, one easily sees that the general case follows by the same outline
with only minor modifications. □


Recall that $S_0$ is the first operator in a whole Cuntz system of isometries,

\[ \{ S_j \mid 0 \le j \le |\det A| - 1 \}. \]

This system (see details in the next appendix) in turn depends on an associated system
of subband filters $\{ m_j(\cdot) \mid 0 \le j \le |\det A| - 1 \}$, which are now $\mathbb{Z}^d$-periodic,
or equivalently functions on the $d$-torus $\mathbb{T}^d := \mathbb{R}^d / \mathbb{Z}^d$.

While the number $N$ from $\mathcal{O}_N$, the Cuntz algebra, is simply the scaling in the
wavelet analysis of functions on the real line $\mathbb{R}$, i.e., in one dimension, it is a little
more subtle in higher dimension, i.e., in $\mathbb{R}^d$.

There, a scaling is specified by a choice of a $d$-by-$d$ matrix $A$. To be useful
in wavelet bases, and in the study of scale-similarity, the matrix $A$ must fulfill two
conditions: The entries of $A$ must be integers; and all its eigenvalues $\lambda$ must satisfy
$|\lambda| > 1$. Then $N := |\det A|$ yields the $N$ in $\mathcal{O}_N$. We saw in Chapter 7 that this
number $N$ is also the number of frequency bands which is needed in a subband-filter
approach to wavelet bases.

Strictly speaking, this is true only if we do wavelet bases on $\mathbb{R}^d$ itself. But if we
are working on fractals, then the number of frequency bands will typically be strictly
smaller than $|\det A|$.
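
The two conditions on the scaling matrix $A$ (integer entries, all eigenvalues of modulus greater than 1) and the resulting $N = |\det A|$ are easy to test in the case $d = 2$. The sketch below is only an illustration under that assumption; the function name `scaling_data` and the three sample matrices are hypothetical choices, not examples from the book.

```python
import cmath

def scaling_data(A):
    """For a 2x2 integer matrix A, return (N, eigenvalues, is_expansive).

    N = |det A|; the eigenvalues come from the characteristic polynomial
    lambda**2 - trace * lambda + det = 0.
    """
    (a, b), (c, d) = A
    if not all(isinstance(entry, int) for entry in (a, b, c, d)):
        raise ValueError("the entries of A must be integers")
    trace, det = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    eigenvalues = ((trace + disc) / 2, (trace - disc) / 2)
    is_expansive = all(abs(lam) > 1 for lam in eigenvalues)
    return abs(det), eigenvalues, is_expansive

# Dyadic scaling in the plane: eigenvalues 2, 2 and N = 4.
N_dyadic, _, ok_dyadic = scaling_data([[2, 0], [0, 2]])
# A matrix with eigenvalues +sqrt(2), -sqrt(2) and N = 2.
N_root2, _, ok_root2 = scaling_data([[1, 1], [1, -1]])
# Integer entries, but the eigenvalue 1 violates |lambda| > 1.
N_bad, _, ok_bad = scaling_data([[2, 0], [0, 1]])
```

The last matrix shows that a determinant of modulus greater than 1 is not enough: every eigenvalue has to lie outside the closed unit disk.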


Exercises 


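A.1. Let $\varphi \in L^2(\mathbb{R})$ and suppose that the function

\[ S_\varphi(t) = \sum_{k \in \mathbb{Z}} |\hat{\varphi}(t + k)|^2 \]

is well defined, where $\hat{\varphi}$ denotes the Fourier transform of $\varphi$. Suppose in addition that
there are finite positive constants $A$ and $B$ such that

\[ A \le \sum_{k \in \mathbb{Z}} |\hat{\varphi}(t + k)|^2 \le B, \quad t \in \mathbb{R}, \]

holds.

(a) Then show that the ansatz

\[ W\Bigl( \sum_{n \in \mathbb{Z}} x_n \varphi(\cdot - n) \Bigr) = (x_n)_{n \in \mathbb{Z}} \]

yields a well-defined linear operator on the closed linear span $\mathcal{D}_\varphi$ of the
translated-function family $\{\varphi(\cdot - n) \mid n \in \mathbb{Z}\}$ in $L^2(\mathbb{R})$, and that the following estimate
holds:

\[ A \sum_{n \in \mathbb{Z}} |x_n|^2 \le \Bigl\| \sum_{n \in \mathbb{Z}} x_n \varphi(\cdot - n) \Bigr\|_{L^2(\mathbb{R})}^2 \le B \sum_{n \in \mathbb{Z}} |x_n|^2. \]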


(b) Using the usual ordering for hermitian operators in Hilbert space, show that
the estimates in (a) may be rewritten in the following equivalent form:

\[ B^{-1} I \le W^* W \le A^{-1} I, \]

where $I$ denotes the identity operator in the closed subspace $\mathcal{D}_\varphi$.


A.2. Let $W$, $\varphi$, and $\mathcal{D}_\varphi$ be as in the previous exercise, with the operator
$W\colon \mathcal{D}_\varphi \to \ell^2(\mathbb{Z}) =: \ell^2$ defined as before.

Suppose in addition that a sequence $(a_k)_{k \in \mathbb{Z}}$ is given and fixed such that an associated
operator $S\colon \ell^2 \to \ell^2$ is bounded, where


\[ (Sx)_n := \sqrt{2} \sum_{k \in \mathbb{Z}} a_{n-2k}\, x_k. \]


On $L^2(\mathbb{R})$, set $U = U_2$, where

\[ (Uf)(t) = \frac{1}{\sqrt{2}} f\Bigl(\frac{t}{2}\Bigr), \quad f \in L^2(\mathbb{R}),\ t \in \mathbb{R}. \]

(a) If the function $\varphi$ satisfies the scaling identity

\[ \varphi(t) = 2 \sum_{k \in \mathbb{Z}} a_k \varphi(2t - k), \]

then show that the space $\mathcal{D}_\varphi$ is invariant under the dyadic scaling operator $U$.
(b) Conditions as in (a): Show that the operator-intertwining relation

\[ W U_2 = S W \]

holds as an identity on the subspace $\mathcal{D}_\varphi$.


A.3. With assumptions as in the previous two exercises, find a formula for the adjoint
operator $W^*\colon \ell^2 \to \mathcal{D}_\varphi$.

Hint: Up to the Fourier duality from Table C.1, the answer is multiplication by
$S_\varphi^{-1}$ followed by $(x_n) \mapsto \sum_{n \in \mathbb{Z}} x_n \varphi(\cdot - n)$.


Appendix B: Hilbert space and systems of operators


Our use of Hilbert space throughout this book facilitates the breaking up of a computation
or an algorithm into a set of basic algorithmic steps. In fact, it gives a geometric
meaning to these matrix factorizations, as stressed for example in [Jor03]. Hence a
signal $x = (x_n)$ will be viewed as a vector in the Hilbert space $\mathcal{H} = \ell^2(\mathbb{Z})$. But by
Fourier duality (in this case, Parseval’s identity), we will identify $\mathcal{H}$ with the Hilbert
space of $L^2$ functions on $\mathbb{T}$, and it will be clear from the context which meaning $\mathcal{H}$
has. This is the geometric viewpoint used throughout the book, and in particular in
Figures 7.8–7.13 (pp. 125–128). When some signal $x$ is broken up into a finite number,
say $N$, of frequency bands, the result is then a vector in the $N$-fold direct (or
orthogonal) sum of $\mathcal{H}$ with itself, denoted $\mathcal{H}_N$; or equivalently the result is a
vector-valued function on $\mathbb{T}$. The reader easily checks that the Hilbert norm on $\mathcal{H}_N$ agrees
with the natural Hilbert norm on the space of all square-integrable functions from $\mathbb{T}$
into the $N$-dimensional Hilbert space $\mathbb{C}^N$.

As illustrated in Figures 7.14 and 7.9 (pp. 132 and 126), this operator-theoretic 
viewpoint can be accomplished with a set of N operators, or subband filters. To take 
advantage of matrix algebra, we shall view this operator system as a column matrix 
of operators. As is familiar for scalar matrices, we then note that the Hilbert-space 
adjoint of a column matrix of operators will be a row matrix of operators, the entries 
in the row being simply the adjoint operators from the column. And we note that a 
row operator is conveniently viewed as a single operator from $\mathcal{H}_N$ into $\mathcal{H}$.

When composing several algorithmic steps, we are faced with operators from $\mathcal{H}_N$
into itself, or with a composition of such operators. The introduction of $N$-by-$N$ matrices
with operator entries lets us represent composition of operators on $\mathcal{H}_N$ as matrix
multiplication, again facilitating computations. Moreover, the requirement of stability
from signal processing leads us to favor unitary operators from $\mathcal{H}_N$ to $\mathcal{H}_N$. But
if frequency-response filters are involved, we will (as noted above) be studying matrices
of operators where the $N^2$ individual entry operators are multiplication operators
in $\mathcal{H}$. In the literature of signal-processing engineering (see [StNg96], [GaNS01a],
[GaNS01b], [GaNS02]), these matrices of functions are called polyphase matrices
(as in the heading of this group of appendices). So a polyphase matrix is a function
$F$ from $\mathbb{T}$ into the algebra of all $N$-by-$N$ complex matrices. We say that it is unitary
if this function $F$ maps $\mathbb{T}$ into the group of $N$-by-$N$ unitary matrices. (Note that
$F(z)$ will only be unitary for $z$ in $\mathbb{T}$, and not for $z$ in the complement of $\mathbb{T}$ within the
complex plane $\mathbb{C}$.)
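
The parenthetical remark above can be seen on a concrete $2 \times 2$ example. The sketch below is an illustration only: the matrix function `F` is a hypothetical Haar-type choice, not one taken from the book, and the helper names are assumptions.

```python
import cmath
import math

def F(z):
    """Illustrative 2x2 matrix function: (1/sqrt 2) * [[1, z], [1, -z]]."""
    s = 1 / math.sqrt(2)
    return [[s, s * z], [s, -s * z]]

def gram(M):
    """Return M M^* for a square matrix M given as nested lists."""
    n = len(M)
    return [[sum(M[i][k] * M[j][k].conjugate() for k in range(n))
             for j in range(n)] for i in range(n)]

def distance_from_identity(M):
    """Largest entrywise deviation of M from the identity matrix."""
    n = len(M)
    return max(abs(M[i][j] - (1 if i == j else 0))
               for i in range(n) for j in range(n))

# On the circle T the matrix F(z) is unitary ...
on_circle = distance_from_identity(gram(F(cmath.exp(0.9j))))
# ... while at z = 0.5, inside the unit disk, it is not.
off_circle = distance_from_identity(gram(F(0.5)))
```

For this choice $F(z)F(z)^*$ has diagonal entries $(1 + |z|^2)/2$ and off-diagonal entries $(1 - |z|^2)/2$, so unitarity holds exactly when $|z| = 1$.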

It is the purpose of this appendix to point out that the operator-algebraic frame- 
work of Cuntz algebras from Chapter 9 (see also [Cun77] and [BrJo02b]) is espe- 
cially useful in making the link between the theory of individual filters in a system 
of frequency bands, and operators on the system of vector functions which makes up 
the band under discussion. And we will further be stressing the use of this theory in
the analysis of the wavelet and fractal algorithms throughout the book.

As noted, the two figures 7.7 and 7.14 (pp. 124 and 132) illustrate the concept
of subband filtering from signal processing, but in the more general context adapted
here to wavelets and to fractals. In this operator-theoretic approach, we work out
algorithmic diagrams starting with $N$ functions, say $m_0, m_1, \dots, m_{N-1}$, on $\mathbb{T}$; and
we build a system of operators

\[ S_j = S_{m_j}, \quad j \in \mathbb{Z}_N = \mathbb{Z}/N\mathbb{Z} \cong \{0, 1, \dots, N-1\}, \]

satisfying the following relations:

\[ S_j^* S_k = \delta_{j,k} I, \tag{B.1} \]

\[ \sum_j S_j S_j^* = I. \tag{B.2} \]

Definition B.1. If we think of (B.1)–(B.2) as axioms for operators in Hilbert space
$\mathcal{H}$, then we arrive at the familiar Cuntz relations $\mathcal{O}_N$ [Cun77] from operator-algebra
theory; and so subband filtering yields a particular family of representations of $\mathcal{O}_N$;
see Table C.1 in the next appendix.
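
The relations (B.1) and (B.2) can be tested on finitely supported signals. The following sketch makes several assumptions that are not in the text: $N = 2$, Haar filters $m_0(z) = (1+z)/2$ and $m_1(z) = (1-z)/2$ (normalized so that $m_0(1) = 1$), and signals stored as dictionaries `{n: x_n}`. The time-domain formulas used for $S_j$ and $S_j^*$ are the ones obtained from $(S_j \xi)(z) = \sqrt{N}\, m_j(z)\, \xi(z^N)$.

```python
import math

N = 2
# Illustrative Haar filter taps a^{(j)}_t (an assumption of this sketch).
TAPS = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: -0.5}}

def S(j, x):
    """(S_j x)_n = sqrt(N) * sum_k a_{n-kN} x_k: up-sample, then filter."""
    y = {}
    for k, x_k in x.items():
        for t, a_t in TAPS[j].items():
            y[N * k + t] = y.get(N * k + t, 0.0) + math.sqrt(N) * a_t * x_k
    return y

def S_star(j, y):
    """(S_j^* y)_k = sqrt(N) * sum_t a_t y_{t+kN} (the taps are real)."""
    x = {}
    for n, y_n in y.items():
        for t, a_t in TAPS[j].items():
            if (n - t) % N == 0:
                k = (n - t) // N
                x[k] = x.get(k, 0.0) + math.sqrt(N) * a_t * y_n
    return x

def add(u, v):
    """Entrywise sum of two finitely supported signals."""
    return {n: u.get(n, 0.0) + v.get(n, 0.0) for n in set(u) | set(v)}

def dist(u, v):
    """Sup-norm distance between two finitely supported signals."""
    keys = set(u) | set(v)
    return max((abs(u.get(n, 0.0) - v.get(n, 0.0)) for n in keys), default=0.0)

x = {-2: 1.0, 0: -3.0, 1: 2.5, 5: 4.0}   # an arbitrary test signal

# (B.1): S_j^* S_k = delta_{j,k} I
b1_same = max(dist(S_star(j, S(j, x)), x) for j in range(N))
b1_cross = max(dist(S_star(j, S(k, x)), {})
               for j in range(N) for k in range(N) if j != k)
# (B.2): S_0 S_0^* + S_1 S_1^* = I
b2 = dist(add(S(0, S_star(0, x)), S(1, S_star(1, x))), x)
```

All three quantities should be zero up to rounding. The check is only evidence on one test vector, not a proof; the operator identities themselves follow from the unitarity of the corresponding polyphase matrix.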


Exercises 


B.1. Let $n \in \mathbb{N}$, $n \ge 2$, be given, and let $(S_i)_{i=1}^n$ be a system of operators in a Hilbert
space $\mathcal{H}$ which satisfies the Cuntz relations.
(a) Let $u\colon \mathcal{H} \to \mathcal{H}$ be a unitary operator, and set

\[ S_i' = u S_i. \]

Show that the system $(S_i')_{i=1}^n$ also satisfies the Cuntz relations.
(b) Let $(S_i)_{i=1}^n$ be as above. Let $(a_{i,j}) \in U_n(\mathbb{C})$ = the group of all $n \times n$ unitary
matrices, and set

\[ S_i' = \sum_{j=1}^n a_{i,j} S_j. \]

Show that the system $(S_i')_{i=1}^n$ also satisfies the Cuntz relations.
(c) Formulate and prove the implications demonstrating that the conditions in (a)
and (b) are also necessary.


B.2. Using the formula in Exercise B.1(b), show that the group $U_n(\mathbb{C})$ acts as a
transformation group on the C*-algebra $\mathcal{O}_n$, i.e., as a group of *-automorphisms.


Appendix C: A tale of two Hilbert spaces


We now turn to Table C.1, a systematic table which will serve as a graphic dictionary
translating between current terminology in mathematics and jargon from
signal-processing engineering.


Explanation of Table C.1. The two columns in the table represent the same opera- 
tors, but in different guises, that of time series, and that of frequency functions. Each 
is important, and each has its advantages and its drawbacks. The sequences on the 
left represent time series or signals (where time is the discrete integer index), while 
the associated functions on the right carry frequency information. 

Organization of the table: In the left-hand-side column, we have sequences 
and operations/operators on sequences; and then the corresponding functions and 
operations/operators on functions in the right-hand-side column. Thus, moving down 
the table, each of the listed operators has a representation in each of the two columns; 
i.e., each operator has two incarnations (time and frequency), but they will often 
be denoted with the same symbol. In addition to time/frequency duality, there is a 
second duality, operator vs. adjoint operator. Specifically, each operator, say S, has 
an adjoint operator S*. When S is listed in one line, then the adjoint S* follows in the 
next line. Each of the two is important as each has distinct significance in the context 
of signal processing: The adjoint of up-sampling is down-sampling, the adjoint of 
right-shift is left-shift, and so on. 

Comments on the adjoints: The adjoint of each operator depends on the inner 
product of the Hilbert space in question. On the left and the right, respectively, it is 


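\[ \langle x \mid y \rangle = \sum_{n \in \mathbb{Z}} \bar{x}_n y_n \quad \text{and} \quad \langle \xi \mid \eta \rangle = \int_{\mathbb{T}} \overline{\xi(z)}\, \eta(z), \]

where the integration is with respect to the Haar measure on $\mathbb{T}$. Note that, with
$\downarrow$ denoting the down-sampling operator from Table C.1,

\[ \lim_{k \to \infty} {\downarrow}^k \xi = \int_{\mathbb{T}} \xi \, d\mathrm{Haar}. \tag{C.1} \]

In fact, we may take (C.1) as a definition of the Haar measure on $\mathbb{T}$. Specifically,
one checks independently that the limit in (C.1) exists for all $\xi \in C(\mathbb{T})$. Introducing
rotations on $\mathbb{T}$, $\xi_w(z) = \xi(zw)$ for $z, w \in \mathbb{T}$, one checks that

\[ \lim_{k \to \infty} {\downarrow}^k \xi = \lim_{k \to \infty} {\downarrow}^k \xi_w, \]

which is the invariance characterizing Haar measure.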
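
The limit (C.1) can be watched numerically. The sketch below assumes $N = 2$ and uses the frequency-side form of down-sampling, $(\downarrow\xi)(z) = \frac{1}{N} \sum_{w^N = z} \xi(w)$; the test function $\xi(z) = 1/(2 - z)$ is an illustrative choice (not from the book) whose Haar integral is its constant Fourier coefficient $1/2$.

```python
import cmath
import math

def down_sample(xi, N=2):
    """Return z -> (1/N) * sum of xi(w) over the N solutions w of w**N = z."""
    def result(z):
        r, theta = abs(z), cmath.phase(z)
        roots = (r ** (1 / N) * cmath.exp(1j * (theta + 2 * math.pi * m) / N)
                 for m in range(N))
        return sum(xi(w) for w in roots) / N
    return result

def xi(z):
    # Illustrative continuous function on T (assumption):
    # 1/(2 - z) = sum_{n >= 0} z**n / 2**(n + 1), with Haar integral 1/2.
    return 1 / (2 - z)

z0 = cmath.exp(1.3j)          # an arbitrary point on T
iterates = [xi]
for _ in range(5):            # build the functions down^k xi for k = 1, ..., 5
    iterates.append(down_sample(iterates[-1]))

# Distance from the Haar integral 1/2 after k down-sampling steps.
errors = [abs(f(z0) - 0.5) for f in iterates]
```

For this $\xi$ the $k$-th iterate equals $\frac{1}{2}\,(1 - z\,2^{-2^k})^{-1}$, so the errors decay doubly exponentially in $k$, independently of the point `z0`.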

The operation of adjoint (from operator theory) reflects the word “mirror” in 
quadrature-mirror filter from signal processing. Recall, taking the adjoint of S twice 
gets you back S itself, i.e., we have 


\[ S^{**} = S, \]

as is immediate from $\langle Sx \mid y \rangle = \langle x \mid S^* y \rangle$.


Table C.1. Operations on two Hilbert spaces: The correspondence principle.

$x \in \ell^2(\mathbb{Z})$  |  $\xi \in L^2(\mathbb{T})$

Vectors in each space
time signals $(x_n)_{n \in \mathbb{Z}}$  |  frequency functions $\xi(z) = \sum_{n \in \mathbb{Z}} x_n z^n$, $z \in \mathbb{T}$

Filters
filter by $m(z) = \sum_k a_k z^k$: $y_n = (F_m x)(n) = \sum_k a_k x_{n-k}$  |  multiplication $M_m$: $(M_m \xi)(z) = m(z)\, \xi(z)$

Shift operators
shift to the right: $(Sx)(n) = x_{n-1}$, $n \in \mathbb{Z}$  |  multiplication by $z$: $(M_z \xi)(z) = z\, \xi(z)$, $z \in \mathbb{T}$

The adjoint of a shift operator is also a shift:
shift to the left: $(S^* x)(n) = x_{n+1}$, $n \in \mathbb{Z}$  |  multiplication by $z^{-1} = \bar{z}$ for $z \in \mathbb{T}$: $(M_z^* \xi)(z) = M_{z^{-1}} \xi(z) = z^{-1} \xi(z)$

Sampling operations (by $N$)
up-sampling $\uparrow$: $(\uparrow x)(n) = x_k$ if $n = kN$, and $0$ if $N \nmid n$  |  $(\uparrow \xi)(z) = \xi(z^N)$

Down-sampling is the adjoint of up-sampling:
$\downarrow = \uparrow^*$: $(\downarrow x)(n) = x_{nN}$, $n \in \mathbb{Z}$  |  $\downarrow = \uparrow^*$: $(\downarrow \xi)(z) = \frac{1}{N} \sum_{w^N = z} \xi(w)$

Subband-filter operators
$S_m = \sqrt{N}\, M_m \uparrow$: $(S_m x)(n) = \sqrt{N} \sum_k a_{n-kN}\, x_k$  |  $(S_m \xi)(z) = \sqrt{N}\, m(z)\, \xi(z^N)$

The adjoint operators:
$S_m^* = \downarrow \sqrt{N}\, M_{\bar{m}}$: $(S_m^* x)(n) = \sqrt{N} \sum_k \bar{a}_k\, x_{k+nN}$  |  $(S_m^* \xi)(z) = \frac{1}{\sqrt{N}} \sum_{w^N = z} \overline{m(w)}\, \xi(w)$


[Figure C.1: signal-processing flow diagram; input of bands on the left, output signal on the right.]

Fig. C.1. Representation of $S = TU^*$. This and the next diagram illustrate the use of the
operator relations from Appendix C in synthesis of subbands of a signal, or of a subdivision
of an image. The two diagrams form a pair. The present first one represents a matrix action
transforming one band-configuration into another. This is followed by up-sampling applied to
each strand. There is then a system of translations before the strands are merged and added.
This diagram is a natural dual version of the next one, Figure C.2, i.e., the analysis step which
begins with subdivision, or breaking apart an input signal.
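
Three rows of Table C.1 can be spot-checked on finitely supported signals: the up-sampling row, the subband-filter row, and the statement that down-sampling is the adjoint of up-sampling. The sketch below is an illustration under stated assumptions: signals are dictionaries `{n: x_n}`, and the values of `N`, `a`, `x`, `y`, and `z` are arbitrary test data, not taken from the book.

```python
import cmath
import math

def transform(x, z):
    """xi(z) = sum_n x_n z**n for a finitely supported signal {n: x_n}."""
    return sum(x_n * z**n for n, x_n in x.items())

def up_sample(x, N):
    """(up x)(n) = x_k if n = kN, and 0 otherwise."""
    return {N * k: x_k for k, x_k in x.items()}

def down_sample(x, N):
    """(down x)(n) = x_{nN}."""
    return {n // N: x_n for n, x_n in x.items() if n % N == 0}

def filter_by(a, x):
    """Cauchy product y_n = sum_k a_k x_{n-k}."""
    y = {}
    for k, a_k in a.items():
        for n, x_n in x.items():
            y[n + k] = y.get(n + k, 0.0) + a_k * x_n
    return y

def inner(x, y):
    """<x | y> = sum_n conj(x_n) y_n, conjugate-linear in the first slot."""
    return sum(x_n.conjugate() * y.get(n, 0.0) for n, x_n in x.items())

N = 3
a = {0: 0.5, 1: 0.25, 2: 0.25}            # illustrative filter taps
x = {-1: 2.0, 0: 1.0, 2: -1.5}            # illustrative signals
y = {-3: 1.0, 0: 2.0, 1: 5.0, 6: -1.0}
z = cmath.exp(0.4j)                        # a point on T

m_z = transform(a, z)
S_m_x = {n: math.sqrt(N) * v for n, v in filter_by(a, up_sample(x, N)).items()}

# Up-sampling row: the transform of (up x) is xi(z**N).
up_error = abs(transform(up_sample(x, N), z) - transform(x, z**N))
# Subband row: the transform of S_m x is sqrt(N) m(z) xi(z**N).
subband_error = abs(transform(S_m_x, z) - math.sqrt(N) * m_z * transform(x, z**N))
# Adjoint row: <up x | y> = <x | down y>.
adjoint_error = abs(inner(up_sample(x, N), y) - inner(x, down_sample(y, N)))
```

Each error should vanish up to rounding. The same pattern could be used to test the remaining rows, for instance the shift against multiplication by $z$.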


We now resume the discussion of representations of $\mathcal{O}_N$.
For our reference point, we shall use the particular representation $(T_j)_{j \in \mathbb{Z}_N}$ of
$\mathcal{O}_N$ given by

\[ (T_j \xi)(z) = z^j \xi(z^N) \quad \text{for } z \in \mathbb{T},\ j \in \{0, 1, \dots, N-1\},\ \text{and } \xi \in L^2(\mathbb{T}); \tag{C.2} \]

and the reader readily checks that the $\mathcal{O}_N$-relations are satisfied, i.e., that

\[ T_j^* T_k = \delta_{j,k} I \quad \text{and} \quad \sum_{j=0}^{N-1} T_j T_j^* = I. \tag{C.3} \]

In this appendix, we show that there is an alternative approach based instead on
certain matrix-functions $U = (U_{j,k})_{j,k \in \mathbb{Z}_N}$ from $\mathbb{T}$ into the group of all $N \times N$
unitary matrices $=: U_N(\mathbb{C})$.

Introducing operator-valued matrices and adjoints

\[ S = (S_0, \dots, S_{N-1}), \qquad S^* = \begin{pmatrix} S_0^* \\ \vdots \\ S_{N-1}^* \end{pmatrix}, \tag{C.4} \]

[Figure C.2: signal-processing flow diagram; input signal on the left, output of bands on the right; the strands carry translations up to $z^{-(N-1)}$.]

Fig. C.2. Representation of $S^* = UT^*$. The present two figures together, Figures C.1 and
C.2, represent equivalent formulations of the respective left-hand and right-hand sides in
Figure 7.7 (p. 124). But the present figures in fact represent signal-processing algorithms (or
equivalently wavelet algorithms) that are more versatile than the more traditional polyphase
matrices: specifically, the matrix functions in our present diagrams may be arbitrary, and be
of arbitrary size. The size of a polyphase matrix equals the number of subbands which is
used. But we also allow non-unitary matrix functions. Of course, more general choices of
matrices in Figures C.1–C.2 will then correspond to more general choices of filters in
Figure 7.7. These choices are highly relevant, for example for the algorithms based on lifting,
see [DaSw98]. Our present Figures C.1–C.2 are multiband versions of diagrammatic
representations of subsampling/subband-filtering algorithms which appear frequently in the
signal-processing engineering literature; see for example [WWW1] with text, and [WWW2] and
[WWW3] with pictures.


and similarly for $(T_j)_{j \in \mathbb{Z}_N}$, we arrive at the following matrix formulas:

\[ S = TU^* \tag{C.5} \]
and
\[ S^* = UT^*. \tag{C.6} \]


In signal processing, the representation of these two operator identities takes the 
diagrammatic form shown in Figures C.1 and C.2, and the operator matrix U is 
called the polyphase matrix. 

Specifically, in terms of processing diagrams the operator-theoretic formulas 
(C.5) and (C.6) turn into Figures C.1 and C.2. 


Remark C.1. Our use here of the operator notation $S$ and $S^*$ is reversed from the
convention in signal processing. (Conventions: engineers typically set $F = S^*$ and
$F^* = S$.) Our choice of terminology is motivated by the traditions in operator-algebra
theory, and by the fact that the individual operators $S_j$ will then be isometries,
while the $S_j^*$'s are co-isometries. Specifically, $S_j^* S_j = I$ for $j = 0, \dots, N-1$; and
each operator $S_j S_j^* = E_j$ is a projection. Moreover, the projections $(E_j)_{j \in \mathbb{Z}_N}$ satisfy

\[ E_j E_k = \delta_{j,k} E_j \quad \text{(orthogonality)}, \tag{C.7} \]
and
\[ \sum_{j \in \mathbb{Z}_N} E_j = I. \tag{C.8} \]

EXPLANATION OF PROGRAMMING DIAGRAMS FROM SIGNAL PROCESSING 
AND THE CORRESPONDING COMPOSITION OF OPERATORS IN HILBERT SPACE: 
GUIDE TO FIGURES C.1 AND C.2. Caution: in general note that we read figures 
with input-output flow diagrams from left to right, and this corresponds to an action 
of operators, but now reading the action of the respective operators from right to 
left. So the first figure, Figure C.1, represents the following operator factorization: 
S = TU*. The direction from the left to the right in a diagram corresponds to the 
flow of the operations performed on signals. Specifically, the diagram in Figure C.1 
begins with input of bands into a box for U* on the left in the figure. This is followed 
by the diagram for the operator T which involves bands that are assembled into a sin- 
gle output signal. In contrast, the second picture, Figure C.2, is the result of taking 
the adjoint of the operators from Figure C.1. So it corresponds to the operator fac- 
torization of the adjoint operator S* = UT*, and it begins with the diagram for the 
operator T* on the left in Figure C.2. The diagram for T* is then followed by a box 
placed to the right-hand side of T* representing the matrix U. The result of matrix 
multiplication by U is a vector output representing a system of banded signals. These 
output signals may then in turn be the input in a new operation representing a new 
and different matrix. The various matrices are also called “gates” in programming. 
The unitary matrices correspond to gates which preserve the energy of signals from 
input to output. 


The next appendix is largely concerned with Figures C.1–C.2, their use, their
interpretation, and their relationship to both the Cuntz formulas (B.1)–(B.2) and the
polyphase-matrix formalism from engineering.


Exercises


C.1. Let $n \in \mathbb{N}$, $n \ge 2$, be given and let $S_1, \dots, S_n$ be a system of bounded operators
in a Hilbert space $\mathcal{H}$. Then show that the following two a priori estimates are
equivalent:

\[ \Bigl\| \sum_{i=1}^n S_i u_i \Bigr\|^2 \le \sum_{i=1}^n \|u_i\|^2 \quad \text{for all } u_i \in \mathcal{H}; \tag{CE.1} \]
and
\[ \sum_{i=1}^n \|S_i^* v\|^2 \le \|v\|^2 \quad \text{for all } v \in \mathcal{H}. \tag{CE.2} \]

Note: If the estimate (CE.1) holds, then we say that $(S_1, \dots, S_n)$ is a
row-contraction.

C.2. Let $n \in \mathbb{N}$, $n \ge 2$, be given, and consider a family of 1-periodic functions
$m_1, \dots, m_n$ on $\mathbb{R}$ which are assumed in $L^\infty$. On the Hilbert space $\mathcal{H} = L^2(0, 1)$ we
introduce the system

\[ (S_i f)(x) = \sqrt{2}\, m_i(x)\, f(2x), \quad x \in (0, 1), \]

where $x \mapsto 2x$ is understood mod 1.

(a) Then write a set of necessary and sufficient conditions on the function system
$(m_i)_{i=1}^n$ for $(S_1, \dots, S_n)$ to be a row-contraction.

(b) Spell out how in the case $n = 2$ the conditions in (a) generalize the
quadrature-mirror filter condition, i.e., the assumption that the matrix

\[ \begin{pmatrix} m_1(x) & m_1(x + \tfrac{1}{2}) \\ m_2(x) & m_2(x + \tfrac{1}{2}) \end{pmatrix} \]

is in $U_2(\mathbb{C})$ for a.a. $x$, i.e., is a unitary $2 \times 2$ complex matrix.
C.3. Verify the assertions made in Table C.1 about adjoint operators, i.e., the formulas
for $S_m^*$, the up-sampling operator $\uparrow$, the down-sampling operator $\downarrow$, and the formula $\downarrow = \uparrow^*$.
C.4. Using the identification of Hilbert spaces from Table C.1, $\ell^2(\mathbb{Z}) \cong L^2(\mathbb{T}) \cong
L^2(0, 1) \cong L^2(\mathbb{R}/\mathbb{Z})$, return to the operator

\[ W\colon \mathcal{D}_\varphi \to \ell^2 \]

from Exercises A.1–A.2; same assumptions.

(a) Then show that when $M := WW^*$ is realized as an operator in $L^2(\mathbb{R}/\mathbb{Z})$,
then it is a multiplication operator, i.e., multiplication by a 1-periodic function on $\mathbb{R}$.
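(b) Show that $M = WW^*$ from (a) is multiplication by the function

\[ \mathbb{R} \ni t \mapsto \Bigl( \sum_{k \in \mathbb{Z}} |\hat{\varphi}(t + k)|^2 \Bigr)^{-1}, \]

i.e., by the reciprocal of the function $S_\varphi$ from Exercise A.1, in agreement with the
bounds $B^{-1} I \le W^* W \le A^{-1} I$ of Exercise A.1(b).


Appendix D: Signal processing, matrices, and programming
diagrams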


We now turn to the Hilbert-space geometry of the polyphase matrices. If $(S_j)_{j \in \mathbb{Z}_N}$
and $(T_j)_{j \in \mathbb{Z}_N}$ are a pair of representations of $\mathcal{O}_N$ in a Hilbert space $\mathcal{H}$, i.e., both
satisfying (B.1)–(B.2), then we show that the operator matrix

\[ U := (S_i^* T_j) \tag{D.1} \]

is unitary, i.e., $U$ is a unitary operator in the Hilbert space

\[ \mathcal{H}_N := \underbrace{\mathcal{H} \oplus \cdots \oplus \mathcal{H}}_{N \text{ times}}. \]

The convention in (D.1) is that $U$ is an $N \times N$ matrix with entries in the algebra $B(\mathcal{H})$
of all bounded linear operators in $\mathcal{H}$, i.e., that $U_{i,j} := S_i^* T_j$ for $(i, j) \in \mathbb{Z}_N \times \mathbb{Z}_N$.
The formulas (C.5)–(C.6) are matrix products; specifically,

\[ S_j = \sum_k T_k U_{j,k}^*, \quad j \in \{0, \dots, N-1\}; \tag{D.2} \]

and

\[ S_j^* = \sum_k U_{j,k} T_k^*, \quad j \in \{0, \dots, N-1\}. \tag{D.3} \]

But if $U$ is unitary, we also get $T = SU$; or, in matrix form,

\[ T_j = \sum_k S_k U_{k,j}, \tag{D.4} \]

keeping in mind that the products on the right-hand side refer to products in the
non-abelian algebra $B(\mathcal{H})$ of all bounded linear operators on $\mathcal{H}$. Note further that the
matrix representation of formula (D.1) now takes the form

\[ U = S^* T, \tag{D.5} \]

where $S^*$ is a column matrix of operators, while $T$ is a row matrix.


Proposition D.1. Let $(S_j)_{j \in \mathbb{Z}_N}$ be a representation of $\mathcal{O}_N$ in a Hilbert space $\mathcal{H}$, and
let $U$ be a bounded operator in the Hilbert space
$\mathcal{H}_N = \underbrace{\mathcal{H} \oplus \cdots \oplus \mathcal{H}}_{N \text{ times}}$. Writing $S$,
as before, and $T = (T_0, \dots, T_{N-1})$ as row matrices of operators, we claim that the
following two conditions are equivalent:

(a) $T = SU$ is a representation of $\mathcal{O}_N$, and
(b) $U = S^* T$ is unitary in $B(\mathcal{H}_N)$.

Proof. (a) ⇒ (b). Now the assumption is that both $S$ and $T$ are representations of
$\mathcal{O}_N$. To verify (b), we calculate the $(i, j)$ matrix entry of each of the matrix products
$U^* U$ and $UU^*$. Specifically,

\[ (U^* U)_{i,j} = \sum_k T_i^* S_k S_k^* T_j = T_i^* T_j = \delta_{i,j} I \]

and

\[ (UU^*)_{i,j} = \sum_k S_i^* T_k T_k^* S_j = S_i^* S_j = \delta_{i,j} I, \]

showing that $U$ is unitary.
(b) ⇒ (a). If (b) holds, we now show that the matrix product $T = SU$ is a
representation of $\mathcal{O}_N$. Specifically, we get the two $\mathcal{O}_N$ relations:

\[ T_i^* T_j = \sum_k \sum_l U_{k,i}^* \underbrace{S_k^* S_l}_{\delta_{k,l} I} U_{l,j} = \sum_k U_{k,i}^* U_{k,j} = \delta_{i,j} I \]

and

\[ \sum_i T_i T_i^* = \sum_i \sum_k \sum_l S_k U_{k,i} U_{l,i}^* S_l^* = \sum_k S_k S_k^* = I, \]

where the middle step uses $\sum_i U_{k,i} U_{l,i}^* = \delta_{k,l} I$. □

We now return to the setting of signal processing. In this application, we have
$\mathcal{H} = L^2(\mathbb{T}) \cong \ell^2(\mathbb{Z})$, and the two operator systems $S$ and $T$ will be as outlined in
the formulas

\[ S_j = S_{m_j} = \sqrt{N}\, M_{m_j} \uparrow \quad \text{(with $\uparrow$ denoting up-sampling by $N$)}, \]

or

\[ (S_j \xi)(z) = \sqrt{N}\, m_j(z)\, \xi(z^N), \]

in Table C.1, and in (C.2) for $T_j$. The representation $(S_j)$ will be called a subband
representation, and $(T_j)$ will be called a base-point representation.


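Corollary D.2. Let $(m_j)_{j \in \mathbb{Z}_N}$ be a family of bounded measurable functions on $\mathbb{T}$,
and let

\[ (S_j \xi)(z) = \sqrt{N}\, m_j(z)\, \xi(z^N), \quad z \in \mathbb{T},\ \xi \in L^2(\mathbb{T}),\ j = 0, 1, \dots, N-1 \tag{D.6} \]

be the corresponding operators in $\mathcal{H} = L^2(\mathbb{T})$.
Then $(S_j)_{j \in \mathbb{Z}_N}$ defines a representation of $\mathcal{O}_N$ in $\mathcal{H}$ if and only if the $N^2$
functions

\[ U_{i,j}(z) := \frac{1}{\sqrt{N}} \sum_{w^N = z} w^j\, \overline{m_i(w)}, \quad z \in \mathbb{T}, \tag{D.7} \]

define a unitary matrix for all $z \in \mathbb{T}$, i.e., $z \mapsto (U_{i,j}(z))$ defines a function from $\mathbb{T}$
into the Lie group $U_N(\mathbb{C})$ of all unitary $N \times N$ matrices.

Moreover, in that case,

\[ m_i(z) = \frac{1}{\sqrt{N}} \sum_j z^j\, \overline{U_{i,j}(z^N)}, \quad z \in \mathbb{T},\ i \in \mathbb{Z}_N. \tag{D.8} \]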

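Proof. The result is immediate from Proposition D.1 once the two formulas (D.7)
and (D.8) have been established, subject to the stated assumptions.
Verification of (D.7): We have

\[ (S_i^* T_j \xi)(z) = \frac{1}{\sqrt{N}} \sum_{w^N = z} \overline{m_i(w)}\, (T_j \xi)(w)
 = \frac{1}{\sqrt{N}} \sum_{w^N = z} \overline{m_i(w)}\, w^j\, \xi(z)
 = U_{i,j}(z)\, \xi(z) \quad \text{for } z \in \mathbb{T}, \]

proving (D.7).
Verification of (D.8): By Proposition D.1, we have $T = SU$, or, reading $(S\xi)(z)$ and
$(T\xi)(z)$ as row vectors,

\[ (T\xi)(z) = (SU\xi)(z) = (S\xi)(z)\, U(z^N), \]

for $z \in \mathbb{T}$, so (using unitarity) we get

\[ (S\xi)(z) = (T\xi)(z)\, U(z^N)^*, \]

or

\[ (S_i \xi)(z) = \sum_j \overline{U_{i,j}(z^N)}\, (T_j \xi)(z)
 = \sum_j \overline{U_{i,j}(z^N)}\, z^j\, \xi(z^N)
 = \sqrt{N}\, m_i(z)\, \xi(z^N), \quad \text{for } z \in \mathbb{T}, \]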
proving the result. □


Exercises


D.1. Let $\mathcal{H}$ be a Hilbert space, and let $n \in \mathbb{N}$, $n \ge 2$, be given. Suppose a system of
isometries $S_1, \dots, S_n$ satisfies the Cuntz relations in $\mathcal{H}$. Set
$\mathcal{H}^{(n)} := \underbrace{\mathcal{H} \oplus \cdots \oplus \mathcal{H}}_{n \text{ times}}$.
(a) Then show that

\[ B(\mathcal{H}) \ni X \mapsto \alpha(X) := (S_i^* X S_j)_{i,j=1}^n \in B(\mathcal{H}^{(n)}) \]

defines an isomorphism of the respective *-algebras of all bounded linear operators.
(Note: By *-isomorphism, we mean that $\alpha$ is bijective, and further satisfies
$\alpha(XY) = \alpha(X)\alpha(Y)$ and $\alpha(X^*) = \alpha(X)^*$ for all $X, Y \in B(\mathcal{H})$.)
(b) Find the inverse

\[ \beta = \alpha^{-1}\colon B(\mathcal{H}^{(n)}) \to B(\mathcal{H}). \]

Hint: If $A = (A_{i,j}) \in B(\mathcal{H}^{(n)})$, show that

\[ \beta(A) = \sum_{i,j} S_i A_{i,j} S_j^*. \]


References and remarks: Systems theory 


The reader will notice that our formulation in Proposition D.1 above is “pure” Hilbert 
space geometry: it is about linear operators in abstract Hilbert space, i.e., it is operator 
theory. The proposition shows that every system of isometries $(S_j)$ in a Hilbert space
$\mathcal{H}$ subject to the Cuntz relations may be rewritten as a single unitary operator $U$ in
a direct-sum Hilbert space, i.e., as a unitary operator matrix. By this we mean that
the matrix-entries in $U$ are operators in $\mathcal{H}$. Then in Corollary D.2 we specialize to
the case when the Cuntz system is of the kind that is used in subband filtering. In
this case we note that the corresponding unitary operator matrix $U$ then has entries
which are multiplication operators. So these are special operator matrices, so-called
polyphase matrices.

But it is worth stressing that the use of systems of isometries $(S_j)$ subject to
the Cuntz relations is ubiquitous in the analysis of problems involving some kind 
of branching, and a feature involving scales, or a notion of similarity which can be 
couched in terms of operator theory. A case in point is the formulation of scattering 
theory due to Ralph Phillips and Peter Lax. In that theory, there is a single unitary 
operator, or a one-parameter group of unitary operators; the parameter is for time. 
The isometries in the theory then arise by restriction to a chosen pair of subspaces 
which are invariant forward in time and backward in time, respectively: so-called in- 
coming and outgoing subspaces. As it turns out, the scattering then “takes place” in 
the ortho-complement of these subspaces. This is “obstacle scattering”: the obstacle
is modeled on the ortho-complement of these subspaces. It turns out that the subspaces
in the Lax–Phillips theory are examples of the kind of multiresolutions that
we have emphasized in the present book.

A number of authors (J.A. Ball, S. Cora, W. Helton, V. Vinnikov, and others; 
see, e.g., [BaCV05] and [BaVi05]) have extended Lax–Phillips scattering to systems
theory, a branch of engineering. Our discussion of the dual pairs of Hilbert spaces in 
Table C.1 applies to this setting as well. Further, it is interesting to note that the fam- 
ily of isometries used in systems theory may also be modeled on the Cuntz relations, 
i.e., that they yield different classes of representations of the Cuntz algebra. Hence 
the general framework of the present appendices applies to this branch of systems 
theory as well. 

It should be noted that our operator systems from Exercises C.1 and C.2 arise 
in a number of places in mathematics, and have already found many uses. Here we 
mention only their use in operator theory as row-contractions (see, e.g., [Arv04], 
[Pop89]) and in wavelet theory (see, e.g., [RoSh97], [She98], and the papers cited 
there). In the context of [RoSh97] these authors have the functions and the operators 
from Exercise C.2 as part of what they call the Unitary Extension Principle (UEP). 
As demonstrated in [RoSh97], the types of generalized wavelets which may be gen- 
erated via use of the UEP have a number of desirable properties. 


Afterword 


Comments on signal/image processing terminology 


Introduction 


A Google search on the word “wavelet” yields 2,650,000 entries, “fractal” 8,530,000,
and “signal processing” 45,000,000, as of September 2005. While this may seem 
staggering, it is wise to keep in mind that general ideas which originated with
wavelets are by now used in many sciences outside of mathematics. While the
general subject may be said to have a mathematical core, the vast number of
applications and independent tracks followed in the last two decades have
led to an amazing diversity, even when counting only the mathematical trends.

This diversity of ideas is exciting, but at the same time it often leaves students
confused. Ideas which started with wavelets have by now turned out to be
significant elsewhere, for example in computational mathematics, in engineering and 
in computer science; in part because of their algorithmic efficiency. Another reason 
is the prevalence of time-space scale similarity in both science (e.g., in physics of 
large scale and physics at the quantum level) and in engineering. While the notion of 
scale-similarity is not always made mathematically precise when it is used outside of 
mathematics (for example in finance!), it is crystallized especially nicely in wavelet 
theory through the notion of a “multiresolution,” a concept from optics. This scale 
similarity may explain why wavelets keep getting reinvented outside mathematics. 

While the mathematical concept of wavelet dates back (in a special case) almost 
one hundred years, to Alfred Haar (mathematics) and Oliver Heaviside (signal anal- 
ysis), the potential of the interaction between mathematics and signal processing was 
only realized much later. It is only the scientific and engineering developments since 
the mid-1980s that have accounted for the impressive growth which is now reflected 
in the gigantic totals we find in typical Google searches. 

There are at least three reasons for the resurgence of activity since 1980: one 
comes from signal processing in engineering (e.g., wireless communication and data 
mining); another from image processing (e.g., digital cameras, JPEG 2000, fingerprints)
and medical imaging (vision, optics and more); and a third from statistics and
telecommunication (quantization, filtering of noisy signals). While the mathematics 
of wavelets continues to be both very active and exciting, it is probably all the inter- 
disciplinary connections and developments which account for the bulk of the activity 
in the general area. 

While interdisciplinary developments are wonderful and often take place along 
parallel tracks, they have a tendency of making it difficult for students, for users such 
as engineers and computer scientists, and for researchers to get started in the subject, 
and to get a good sense of the main trends. They are busy and look for friendly 
introductions to a particular main trend, and each trend has its own scientific culture. 

There are several reasons for why it may now be hard for a student to know where 
to begin: The different disciplines have vastly differing lingo, probably reflecting the 
fact that they have quite different aims. 

For the mathematician who is used to seeing wavelets as a part of function theory, 
it may be hard to accept that many users in digital signal/image processing and its 
applications are mainly interested in numbers and in algorithms. Because of the “cul- 
tural” differences between the diverse subjects, the connection between the function 
theory on one side and the practical algorithms which process numbers on the other 
is not always transparent. 

Similarly, we noticed that colleagues from engineering (computer and electrical 
engineering, industrial engineering, data mining, etc.) find the version of wavelet 
theory in typical mathematics books on function theory rather forbidding. 

While all the diversity of applications is truly impressive, the student from math- 
ematics or from one of the other disciplines involved often has difficulties in getting 
started. This difficulty is not because of a shortage of books, but rather because of an 
apparent divergence of the various trends, developments and applications. 

In this book we have aimed at remedying some of this. To make the book useful 
for diverse audiences, we have approached the task from two sides, and we have 
added a discussion of some key terms below. 

On the one hand, each chapter (including exercises) has elements of tutorial, and 
a practical side as well. On the other hand, we have included separate sections which 
serve to “translate” between the lingo which is used in the different, and sometimes 
disparate disciplines. In this endeavor we make an effort to both explain concepts 
from engineering and statistics to mathematicians, and also to explain the various 
areas of mathematics to engineers and scientists from the neighboring disciplines. 

What follows are two small sections dealing with how ideas and terminology 
are used differently in mathematics and in its applications: First we discuss technical 
terms from inside the book and stress how they are used quite differently in engineer- 
ing and in (relatively) pure mathematics. (While perhaps confusing, this dichotomy 
is hardly surprising since the two communities have quite different aims.) Mindful 
that technical lingo is viewed and used differently by engineers and mathematicians, 
we have tried to present the concepts and ideas evenhandedly from the perspectives
of the two communities. 

Secondly, we briefly hint at the use of some central themes from the text in 
“Computational mathematics.” This expands on our initial presentation of related 
ideas in the Glossary (a systematic discussion of terminology in the front matter 
above, pp. xvii-xxv). Both this “Computational mathematics” section and the Glos- 
sary serve to motivate central ideas used in the text. 

The Glossary is arranged as a table. In it each underlying idea has up to four 
incarnations, hence the different terms! We outline item by item how each concept 
is used in up to four related but different ways; in four contexts: in mathematics, 
in probability, in engineering, and in physics. We hope to clarify (a little) how the 
distinct aims of the four subjects are reflected in a variety of ways, for example in 
the fact that there is often a multitude of names for what is essentially the same term. 


JPEG 2000 vs. GIF 


We begin with two engineering terms of a more recent vintage; both are from image- 
processing. They help us to strike a contrast between two themes in our book, wave- 
lets vs. Fourier. Specifically, there are some practical features unique to A-to-D quan- 
tization (analog-to-digital) for wavelets (e.g., localization, algorithms, computability, 
and resolutions) which are absent in analogous Fourier tools. For much more detailed 
discussions, see, e.g., [JPEG00], [JaMR01] and [Mey05]. There is a huge engineering
literature on this, and we only scratch the surface.


JPEG 2000 


(Joint Photographic Experts Group): A new image coding which is largely wavelet
based. Applications include digital cameras, remote sensing, image archives, scan- 
ning, and medical imaging. 

It comes in different parts: Part 1 offers both lossless and lossy compression and 
provides much better image quality at smaller file-sizes. Part 6 is aimed at compress- 
ing scanned color documents containing both bi-tonal elements as well as images. 

Fundamentally, JPEG 2000 is built much like a certain AI computer-vision 
model, following closely the biology of “human vision.” See [Mar82]: Eye focus, 
zoom, and scales of detail organized as visual resolutions. 

In engineering terms, image-input onto pixels contains information-detail at all 
scales; the detail levels separate the resolution scales, and this hierarchical structure 
can be digitized. Analogously, the human eye (as well as digital cameras) processes 
intensity changes at a variety of resolution scales, much like in the pyramid algorithm 
used in wavelets (where “intensity changes” are wavelet coefficients). Hence, images 
are digitized taking advantage of scale-similarity in recursive processing, and using 
iterated matrix powers, i.e., integral powers of the slanted wavelet matrices outlined 
in Chapter 7. In this form, the (wavelet) algorithm is thus a recursive matrix algorithm,
and it takes advantage of a key feature of wavelet-scales: each degree of scale
is a level in the corresponding combinatorial pyramid. Further, it is localized, so it
does not make the redundant infinite additions that are notorious in Fourier approximations.

JPEG is commonly known as a format for compressing, digitizing and storing 
color images. It supports millions of colors, and it can reduce file
sizes to about 5% of their original size with only a small amount of lost data, i.e.,
“small” relative to detection by the human eye. The latest JPEG 2000 promises to
compress images 200 times with better resulting quality than earlier versions based 
on Fourier waves. The key technology enabling such improvement involves switch- 
ing to wavelets, away from the earlier Fourier waves that were used in discrete cosine 
transform (DCT). 

From [JPEG00]: 


The coder is essentially a bit-plane coder, using the same Layered Zero Cod- 
ing (LZC) techniques which have been employed in a number of embedded 
Wavelet coders. Key additions are: 

• The use of fractional bit-planes, in which the quantization symbols for
any given quantization layer (or bit-plane) are coded in a succession of
separate passes, rather than just one pass.

• A simple embedded quad-tree algorithm is used to identify whether or
not each of a collection of “subblocks” contains any non-zero (significant)
samples at each quantization layer, so that the encoding and decoding
algorithms need only visit those samples which lie within subblocks
which are known to have significant samples.


GIF 


(Graphics Interchange Format)—Largely “Fourier based.” 

A compressed file format used for storing and transmitting color graphics. The 
GIF model is older, and it is more widely used (up to now!), but at the same time it 
is more restricted in capabilities than JPEG 2000; more limited with respect to the 
number of colors (a few hundred) and compression degree. 

With the latest advances with wavelets in digital image processing I imagine new 
titles in the next generation of horror films, such as: 


He Came at Breakfast Time and Left at 8:30 p.m. 
Marooned in Space with a Digital Camera Fanatic 
Digital Photographers from Deneb 

Revolt of the Memory Sticks 

Countdown to jpeg 

Digital Downloads/Uploads from Hell 
Grayscale


This is a notion from digital image analysis. In image processing, for a given image, 
a value is assigned to each subdivision square (called “pixel”). It is a certain sample
number: A single number for each pixel! 

In general, the displayed images themselves are typically composed of shades 
of gray, varying from black at the weakest intensity to white at the strongest. For 
color images, each of the three primary colors is assigned pixel intensities in the 
same way. This amounts to an encoding of various colors with different intensities. 
A mixing then becomes part of the digital processing. Grayscale images are distinct 
from black-and-white images, which in the context of computer imaging are images 
with only two colors; grayscale images have many shades of gray in between. 
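The distinction between color, grayscale, and black-and-white can be illustrated on a tiny array. This is a minimal sketch, not a production conversion: it assumes 8-bit intensities, an equal-weight average of the three primary colors, and a hypothetical threshold of 128; practical conversions often weight the channels unequally.

```python
import numpy as np

# A 2x2 color image: one intensity per primary color per pixel (8-bit).
rgb = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)

# Grayscale: a single number for each pixel.
# Assumption: equal weights for red, green, and blue.
gray = rgb.astype(float).mean(axis=2)

# Black-and-white: only two values survive.
# Assumption: the cut-off 128 is an arbitrary illustrative choice.
THRESHOLD = 128
black_and_white = np.where(gray >= THRESHOLD, 255, 0).astype(np.uint8)

print(gray)
print(black_and_white)
```

The grayscale array keeps intermediate shades, while the thresholded array keeps only the two extreme values.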


Quadrature-mirror filter 


“Quadrature-mirror filter” is engineering jargon, and signal-processing engineers 
aren’t exactly poets. [From the online Oxford English Dictionary [OED]: “quadra- 
ture. (a). Math— The action or process of squaring; spec. the expression of an area 
bounded by a curve, esp. a circle, by means of an equivalent square. More widely, the 
calculation of the area bounded by, or lying under, a curve. (b). 1942 H. M. BACON 
—Differential & Integral Calculus: The desire was to find a square equal in area to the 
area bounded by the given curve (in this case, a circle). For this reason the problem 
has been called the problem of quadrature.”] 

Signal-processing engineers noticed early in the subject that the conditions imposed
on each of the two filter functions, say $m_0$ and $m_1$, that are used in the
analysis/synthesis of speech signals into two subbands (see Figure 7.14, p. 132) involve
sums of squares. Similarly the functions $m_0$ and $m_1$ themselves satisfy a sum-square
rule; or you could say, a circle-rule.

These quadratic conditions are summarized in the text around (9.2.12). Of course 
this is also what allows us to get the unitary matrix functions (of the Appendices) into 
the game. Or equivalently, the representations of the Cuntz relations from Chapter 9 
serve to create a geometric framework for the popular pyramid algorithms of wave- 
let/fractal subdivision algorithms. 

The essential conclusion in the Appendices above is that the so-called “quadra- 
ture conditions” from signal processing, or equivalently from wavelet filters, take a 
simple equivalent form in terms of unitary matrices. See also [Jor03]. Each instance 
of a sum-of-squares rule suggests a circle law, and that is perhaps the root of this en- 
gineering terminology. Actually for the engineers, it is the finite set of numbers (also 
called masking, or wavelet coefficients) that go into the filters that are of more im- 
mediate interest, and the sum-of-squares rule for the filter functions merely reflects 
a quadratic system of equations for the wavelet coefficients, see (1.3.20). They are 
the frequency response to the system (1.3.20). The related condition (1.3.21) refers 
to the low-pass condition which characterizes the first signal-processing filter in a
subband system. For example the four numbers (alias, the four taps) that define the 
Daubechies wavelet (see Chapter 1, Example 1.3.3, and (7.6.6)) result as one of the 
solutions to this quadratic system of equations. 
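The sum-of-squares rules can be checked numerically for the four-tap Daubechies filter. The sketch below rests on assumptions that may differ from the conventions of (1.3.20): the taps are normalized to have unit sum of squares, the high-pass taps are obtained by the usual alternating flip, and the names `h`, `g`, and `response` are illustrative only.

```python
import numpy as np

# Four-tap Daubechies low-pass coefficients.
# Assumption: normalized so that the squares sum to 1 (the sum is then sqrt(2)).
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

# High-pass taps by the alternating flip g_k = (-1)^k h_{3-k} (an assumed convention).
g = np.array([(-1) ** k * h[3 - k] for k in range(4)])

def response(taps, omega):
    """Filter function m(omega) = (1/sqrt 2) * sum_k taps[k] * exp(-i k omega)."""
    k = np.arange(len(taps))
    return np.exp(-1j * np.outer(omega, k)) @ taps / np.sqrt(2.0)

omega = np.linspace(0.0, 2 * np.pi, 257)
m0 = response(h, omega)
m1 = response(g, omega)

# Quadratic system for the taps themselves.
sum_of_squares = float(np.dot(h, h))        # equals 1
shift_by_two = float(h[0] * h[2] + h[1] * h[3])  # equals 0

# The "circle rule" for the filter functions.
circle_rule = np.abs(m0) ** 2 + np.abs(m1) ** 2  # equals 1 at every frequency

print(sum_of_squares, shift_by_two, circle_rule.min(), circle_rule.max())
```

Under these normalizations the low-pass condition reads $m_0(0) = 1$ and $m_1(0) = 0$, which the arrays `m0` and `m1` also satisfy at their first entry.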

Additional discussion of the engineering term “quadrature-mirror filter” and its 
relation to “mirror” can be found on the website [WWW4]. It explains the quadrature 
part differently; i.e., in the following sense of “quadrature:” It is the frequency point 
$\pi/2$, the “quadrature point” that divides the circle into four parts.


What is a frame? 
To the mathematics student: 


It is a generalization of an orthonormal basis (ONB)! Let $(b_i)$, $i$ in some index set,
be an orthonormal basis for a Hilbert space. To understand the generalization, begin
with Parseval’s familiar identity for an ONB:


$$\|x\|^2 = \sum_i |\langle b_i \mid x \rangle|^2. \qquad (*)$$


More generally, if a system of vectors $(b_i)$ satisfies $(*)$, it is said to form a tight
frame, or a Parseval frame. But it need not be an ONB: You might have $\|b_i\| < 1$,
i.e., the norm is “too small.” And redundancy in the system will then smear out
orthogonality.

Or it could be even worse: If instead of an identity in $(*)$, you only have estimates,
with two positive constants $A$ and $B$: The term $A\|x\|^2$ as a lower bound, and $B\|x\|^2$
as an upper bound for $\sum_i |\langle b_i \mid x \rangle|^2$; then you talk about a general frame with $A$
and $B$ as frame bounds.

It is easy to rewrite all of this in terms of dilations from operator theory: Frames
$(b_i)$ in a Hilbert space $\mathcal{H}$ are all of the form $b_i = T(c_i)$ where $(c_i)$ is an ONB in an
expanded Hilbert space $\mathcal{K}$, and $T \colon \mathcal{K} \to \mathcal{H}$ a bounded invertible linear operator.

The Parseval frames, for example, correspond to the case when this operator $T$
is the projection of $\mathcal{K}$ onto $\mathcal{H}$.
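A finite-dimensional illustration may help here. The sketch below is an assumed example, not one taken from the text: three vectors in the plane at 120-degree spacing, rescaled by $\sqrt{2/3}$, form a Parseval frame with norms less than 1, and they arise as the projection of an ONB in three dimensions. The helper name `frame_bounds` is hypothetical.

```python
import numpy as np

# Three vectors in R^2 at 120-degree spacing, each of norm sqrt(2/3) < 1.
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
frame = np.sqrt(2.0 / 3.0) * np.column_stack([np.cos(angles), np.sin(angles)])

def frame_bounds(vectors):
    """Optimal frame bounds (A, B): extreme eigenvalues of sum_i b_i b_i^T."""
    frame_operator = vectors.T @ vectors
    eigenvalues = np.linalg.eigvalsh(frame_operator)
    return float(eigenvalues[0]), float(eigenvalues[-1])

# Parseval case: A = B = 1, so identity (*) holds although the system is no ONB.
A, B = frame_bounds(frame)
x = np.array([0.3, -1.7])
norm_squared = float(x @ x)
coefficient_energy = float(np.sum((frame @ x) ** 2))

# Dilation: append one coordinate so that the rows become an ONB of R^3.
# Projecting back onto the first two coordinates recovers the frame.
onb = np.column_stack([frame, np.full(3, 1.0 / np.sqrt(3.0))])

# A general (non-tight) frame: repeating a vector gives bounds A = 1 < B = 2.
redundant = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
A2, B2 = frame_bounds(redundant)

print(A, B, norm_squared, coefficient_energy, A2, B2)
```

Here the larger space plays the role of $\mathcal{K}$ and the plane that of $\mathcal{H}$; the rows of `onb` are orthonormal, and dropping their last coordinate is the projection.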

So the notion of a “frame” arises from relaxing the standard requirement on a 
“basis.” While at first this particular term “frame” might appear somewhat as a mathematical
technicality or a mere curiosity, in fact we have seen in the past decade
the subject of “frames” emerge in leaps and bounds as a substantial mathematical discipline
in its own right: one with a real presence both in the book literature (see, e.g.,
[Chr03] and the papers cited there) and in research journals. Mathematically there are
good reasons to relax the more stringent axioms that have previously been used for 
bases in infinite-dimensional function spaces; and much of this has been motivated 
by needs dictated by wavelets. Other motivations are drawn from engineering. 
It turns out that it is much easier to generate “wavelet bases” which are only
frames, as opposed to orthonormal bases (ONBs). Moreover, a number of wavelet 
problems in numerical analysis rely on easy and readily available libraries of wave- 
let bases. For this, the “library” of ONB wavelets is too small! Some of the frame 
wavelets have been known in applied mathematics for a generation or more under 
the name of “splines;” see [Chr03] for details. Add to this the fact that now, both 
in wavelet mathematics and in telecommunications engineering, it has proved prac- 
tical to give up the stringent and restrictive axiom of ONBs. What is needed is a 
notion of “basis” which accommodates some degree of redundancy in the process 
of synthesis. (By synthesis we mean the expansion of a vector in a (frame) “basis;”
i.e., reconstructing the vector from its base coefficients.) Our Chapter 6 above illus- 
trates this point by example, and for the simplest wavelet of them all. The issue in its 
general form is closely tied into the role played by scale-similarity. 


To an engineer: 


Frames are used by signal-processing engineers, and they are motivated by practical 
concerns regarding transmission and measurement in telecommunication. Many of
the non-trivial engineering applications and uses take place in finite dimensions. 

The name frame has to do with measurements within a visual “frame,” referring 
to an instrument. For mathematical reasons, we know that an “honest” basis, or an 
ONB, would escape “out of” the frame! 


Alias (aliasing) 
Engineering: 


(1) In signal processing, the effect that causes different continuous signals to cre- 
ate multiplicities or to become indistinguishable (or aliases of one another) when 
sampled. 

(2) In computing, the indirect (usually unexpected) effect on other data when a 
variable or reference is changed; typically referring to multiplicities of the original 
data. 
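The signal-processing sense (1) is easy to reproduce numerically. In this minimal sketch the sampling rate and the two frequencies are arbitrary illustrative values: a cosine at frequency `f_low` and one at `f_low + fs` agree at every sample point, although they differ between sample points.

```python
import numpy as np

fs = 8.0                      # sampling rate (assumed value, samples per unit time)
t = np.arange(16) / fs        # sample times on the grid

f_low = 1.0                   # a slow signal
f_high = f_low + fs           # a fast signal: an alias of the slow one at this rate

samples_low = np.cos(2 * np.pi * f_low * t)
samples_high = np.cos(2 * np.pi * f_high * t)

# On the sampling grid the two continuous signals are indistinguishable.
max_gap_on_grid = float(np.max(np.abs(samples_low - samples_high)))

# Halfway between two sample points they are clearly different.
t_mid = 0.5 / fs
gap_off_grid = float(abs(np.cos(2 * np.pi * f_low * t_mid)
                         - np.cos(2 * np.pi * f_high * t_mid)))

print(max_gap_on_grid, gap_off_grid)
```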


Mathematics: 


This notion of “aliasing” from signal processing is also current in such areas of mathematics
as operator theory and harmonic analysis. It is used in connection with problems
in a Hilbert space, say $\mathcal{H}$, when a super-structure is constructed in the form
of an ambient dilated Hilbert space $\mathcal{K}$, and an isometry embedding $\mathcal{H}$ into $\mathcal{K}$. Then
aliasing is seen as the effect on an initial signal resulting from oversampling. The
signal could be a vector in $\mathcal{H}$, or a frame basis for $L^2(\mathbb{R}^d)$.
We introduced the term sampling in Chapter 1 above. The sampling could be
done with a selection of points on a lattice in $\mathbb{R}^d$ (i.e., a rank-$d$ discrete subgroup in
$\mathbb{R}^d$), and oversampling then refers to sampling points in a bigger lattice.

When oversampling is applied to a continuous signal (realized in $\mathcal{H}$), it creates
one or more different signals (aliases, or multiplicities) indistinguishable from the
original signal; and it is then possible to realize the aliases in the dilated Hilbert space
$\mathcal{K}$. Hence some of our results in Chapter 6 and in [DuJo06c] make aliasing precise
via a super-structure. This is a dilated Hilbert space, and it arises, for example, in the
context of affine wavelet frames in $L^2(\mathbb{R}^d)$.

In this setting, there is a procedure in which sampling in an initial frame for
$L^2(\mathbb{R}^d)$ yields a dilated Hilbert space $\mathcal{K}$ and a super-frame in $\mathcal{K}$. Thus, in a sense
“nothing is lost” in oversampling: The oversampled frame is a corner of a third frame,
a super-structure, or an associated super-frame. The sampling setting in [DuJo06c]
admits a concrete representation of the Hilbert space $\mathcal{K}$ as a direct sum of a finite
number of copies of $L^2(\mathbb{R}^d)$, the number of copies depending on the amount of
oversampling.


Computational mathematics 


On a personal note, after teaching some of the present material in courses, this author 
has come to appreciate the usefulness of Hilbert-space geometry and operator theory 
in addressing even such practical problems as calculating wavelet coefficients with
iterative and fast matrix multiplication algorithms. 

In fact, I have become much more optimistic than I used to be about the practical
and the computational usefulness of relying on even singular wavelet filters of one 
kind or the other, in computation. 

One of the things coming out from Chapter 7, the Appendices above, from my 
teaching of wavelets, and from my work with engineers is the usefulness of Hilbert-space
geometry for iterative algorithms. Thus, once we agree to stay in a fixed resolution
subspace in $L^2(\mathbb{R}^d)$, then we can do wavelet expansions without ever having
to calculate wavelet coefficients the slow way, i.e., by integrating over $\mathbb{R}^d$, or over
a subset of it. The trick is to compute wavelet coefficients instead with the use of 
suitable matrix iterations, and then using slanted matrices (see Figure 7.13, p. 128) 
which are computationally much faster. For more detail, we refer the reader to the 
series of figures in Chapter 7, see especially Figures 7.7, 7.8, and 7.9 (pp. 124-126), 
and Figure 7.13. At first, I was not sure that it is possible to avoid integration, but 
it is! 
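The point that no integration is needed can be made concrete with the simplest filter. The following is a minimal sketch under stated assumptions: it uses the two-tap Haar filter rather than the four-tap one, a signal whose length is a power of two, and periodic wrap-around; the helper names `slanted`, `pyramid`, and `reconstruct` are hypothetical. Each pass multiplies by a pair of slanted matrices, whose rows are copies of the taps shifted two columns at a time.

```python
import numpy as np

def slanted(taps, n):
    """(n/2) x n matrix; row k holds the taps shifted 2k columns (periodic wrap)."""
    rows = np.zeros((n // 2, n))
    for k in range(n // 2):
        for j, c in enumerate(taps):
            rows[k, (2 * k + j) % n] += c
    return rows

# Haar taps (assumption: the simplest choice, normalized to unit sum of squares).
LOW = np.array([1.0, 1.0]) / np.sqrt(2.0)
HIGH = np.array([1.0, -1.0]) / np.sqrt(2.0)

def pyramid(signal):
    """Iterate the slanted matrices: details at each scale, then the coarsest average."""
    approx = np.asarray(signal, dtype=float)
    details = []
    while len(approx) > 1:
        n = len(approx)
        details.append(slanted(HIGH, n) @ approx)  # wavelet coefficients at this scale
        approx = slanted(LOW, n) @ approx          # pass the averages to the next level
    return approx, details

def reconstruct(approx, details):
    """Undo the pyramid with the transposed matrices, coarsest level first."""
    for d in reversed(details):
        n = 2 * len(d)
        approx = slanted(LOW, n).T @ approx + slanted(HIGH, n).T @ d
    return approx

signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
coarse, details = pyramid(signal)
restored = reconstruct(coarse, details)

# Energy is preserved because each pair of slanted matrices forms an orthogonal matrix.
energy_in = float(signal @ signal)
energy_out = float(coarse @ coarse + sum(d @ d for d in details))
print(restored, energy_in, energy_out)
```

Only additions and multiplications of numbers appear; the coefficients at every scale come from repeated matrix products, and the transposes give back the input.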

In concrete cases of image processing, we note that all the pictures you come 
across on the web showing iterative image processing of Lena or of some other digital 
image (see, e.g., [WWW3]) are done precisely with the standard matrix iteration 
algorithm for wavelets. “Computers can’t integrate!” 
What further emerges by a closer scrutiny of these iterative algorithms is that
Cuntz (or Cuntz-like) relations are just perfect for this; see the discussion in the 
Appendices above, especially (B.1)–(B.2) and Table C.1. Of course, with hindsight,
we now know that these relations have their root in old ideas from signal processing; 
see [Jor03]. 

The only technical modification needed for all of these iterative schemes to work 
for such singular filters that are used in fractals of one kind or the other (Chapters 
7-8) is that the slanted matrices might now become infinite, but their slanted nature 
makes this infinity only a mild nuisance. Engineers are good at devising threshold 
schemes for truncating the infinite, as long as it is under control; and as we saw, it is 
for this purpose. 


Epigraphs 


Quotation from John von Neumann on page vii above the Preface: This quote is popular on 
web pages about von Neumann, and about computing and mathematics generally. It is 
apparently not from a published work of von Neumann’s, but Franz L. Alt recalls it as 
a remark made from the podium by von Neumann as keynote speaker at the first na- 
tional meeting of the Association for Computing Machinery in 1947. The exchange at 
that meeting is described at the end of Alt’s brief article Archaeology of computers: 
Reminiscences, 1945–1947, Communications of the ACM, vol. 15, issue 7, July 1972,
special issue: Twenty-fifth anniversary of the Association for Computing Machinery, p. 
694. Alt recalls that von Neumann “mentioned the ‘new programming method’ for ENIAC 
and explained that its seemingly small vocabulary was in fact ample: that future comput- 
ers, then in the design stage, would get along on a dozen instruction types, and this was 
known to be adequate for expressing all of mathematics.... Von Neumann went on to say 
that one need not be surprised at this small number, since about 1,000 words were known
to be adequate for most situations of real life, and mathematics was only a small part of 
life, and a very simple part at that. This caused some hilarity in the audience, which pro- 
voked von Neumann to say: ‘If people do not believe that mathematics is simple, it is only 
because they do not realize how complicated life is.’” 


Quotation from David Mumford on page xv: David Mumford, The dawning of the age 
of stochasticity, Atti della Accademia Nazionale dei Lincei, Classe di Scienze Fisiche, 
Matematiche e Naturali, Rendiconti Lincei, Serie IX, Matematica e Applicazioni, vol. XI, 
2000, special issue: Mathematics Towards the Third Millennium (Papers from the Inter- 
national Conference held in Rome, May 27-29, 1999), p. 107. This is quoted from the 
beginning of Mumford’s article, which offers a fascinating and refreshing view on think- 
ing as Bayesian inference, and the use of probability spaces in image processing. 


In the “Getting started” section of the front matter, on page xv, we “sort of quote” from 
G.H. Hardy, A Mathematician’s Apology, Canto, Cambridge University Press, Cambridge, 
1992, with a foreword by C.P. Snow (first edition 1940, and latest edition 1992), in which 
sixty-six years ago Hardy so eloquently apologized to the World for mathematics. Back 
then Hardy, the Platonic puritan he was, had in mind pure mathematics: at the time, some 
parts of applied mathematics had been used in an unpopular war. In my opening paragraph 
of the present “Getting started” section, I could not help wondering if in the meantime
the winds could have changed; wondering whether perhaps now a mathematics author 
who trespasses into engineering topics and other applied domains might not be expected 
to apologize; at least if he/she has in mind mathematics students as his primary audi- 
ence. Aside from this, Hardy’s lovely little book has become a paradigm for mathematical 
apologies, and any apologetic mathematician ought to at least mention Hardy’s A Mathe- 
matician’s Apology in her credits. 


Quotation from Lewis Carroll on page xvii: Lewis Carroll, Through the Looking-Glass, The 
Complete Illustrated Lewis Carroll, Wordsworth Editions, 1991, page 196. 


Quotation from Søren Kierkegaard on page xxxvii in the “Getting started” section of the front
matter: Søren Kierkegaard, Journalen JJ:167 (1843), Søren Kierkegaards Skrifter, Søren
Kierkegaard Research Center, Copenhagen, 1997–, vol. 18, p. 306. Thanks to Karsten
Kynde, Søren Kierkegaard Forskningscenteret, web page http://www.sk.ku.dk/citater/, for
the Danish text of the quotation and directions to its location in print. The English trans- 
lation of the long quote is my own. The Danish short form is due to Julia Watkin; see the 
web page http://www.utas.edu.au/docs/humsoc/kierkegaard/resources/Kierkquotes.html . 


Quotation from Stephen Hawking on page xliii in the Acknowledgments: The quote here is 
from page ix in Stephen Hawking’s book On the Shoulders of Giants: The Great Works 
of Physics and Astronomy, Running Press, Philadelphia, 2002. This book in turn is a col- 
lection of reprints of original classics in the sciences. The book title “On the shoulders of 
giants” is a quote from Isaac Newton. 


Quotation from Edward B. Burger and Michael Starbird on page 1: Edward B. Burger and 
Michael Starbird, Coincidences, Chaos, and All That Math Jazz: Making Light of Weighty 
Ideas, W.W. Norton & Company, 2005. Quoted from the Front matter: Opening thoughts. 


Quotation from Lewis Carroll on page 9: Lewis Carroll, Through the Looking-Glass, Chapter 
3, The Complete Illustrated Lewis Carroll, Wordsworth Editions, 1991. 


Quotation from C.M. Brislawn on page 22: C.M. Brislawn, Fingerprints go digital, Notices 
of the American Mathematical Society, vol. 42, 1995, p. 1278. 


Quotation from Piet Hein on page 27: Piet Hein, “Problems,” Grooks, Borgens Forlag, Copen- 
hagen, Denmark, The MIT Press, Cambridge, MA, 1966; Grooks 1, Doubleday & Com- 
pany Inc., Garden City, NY, General Publishing Company Limited, Toronto, 1969; Col- 
lected Grooks I, Borgens Forlag, Copenhagen, Denmark, 2002. 


Quotation from Lewis Carroll on page 39: Lewis Carroll, Alice's Adventures in Wonderland, 
Chapter VI, The Complete Illustrated Lewis Carroll, Wordsworth Editions, 1991. 


Quotation from P.A.M. Dirac on page 58: The Development of Quantum Theory (J. Robert 
Oppenheimer Memorial Prize Acceptance Speech), Gordon and Breach Publishers, New 
York, 1971, pp. 20-24. The quote used as an epigraph on p. 2 in Operator Commutation
Relations by Palle E.T. Jorgensen and Robert T. Moore, D. Reidel, Dordrecht, Boston, 
1984, is an abridgement of the above passage. A longer excerpt (with the curious substitu- 
tion of “commutation” for “noncommutation” where it first appears) is presented as an epi- 
graph in the announcement of a “Program on Noncommutative Algebra” at the Mathemat- 
ical Sciences Research Institute, http://www.msri.org/activities/programs/9900/noncomm/. 
Back when I thought about what to put into Operator Commutation Relations, I relied a
lot on The Historical Development of Quantum Theory by Jagdish Mehra and Helmut 
Rechenberg, Springer-Verlag, New York, 1982–. I was also impressed by how well researched
this lovely book-set is. The two authors, Mehra and Rechenberg, did long interviews
over the span of time when they worked on their book set. They had known Bohr,
and they had many meetings with Heisenberg, Pauli, Dirac, and many more, and as I re- 
member, much material in the book set results directly from these interviews. It is good 
that the two coauthors had the interviews, as these giants in quantum theory passed on 
shortly after the book set was completed. I think the book set is a treasure of information 
on quantum theory (especially the mathematical part of it) and on the architects of the 
theory. Late in life, Dirac would always tell the physicists at conferences to look to the 
math for clues to the deep questions in physics, and he liked to use his (Dirac) equation for 
the electron as an example, stressing that he was led to it by paying attention to the beauty 
of the math, more than to the physics experiments. He was alive when I was working on 
Operator Commutation Relations, and I talked to him a few times. He told me that he was
happy to be quoted. He used to visit his son Gabriel Dirac (graph theory) who was my 
colleague in Aarhus. 


Quotation from A.N. Kolmogorov on page 59: A.N. Kolmogorov, Foundations of the Theory
of Probability, 2d English ed., translation edited by Nathan Morrison, Chelsea, New York, 
1956, p. 1; for the German original see [Kol77]. 


Quotation from Benoit B. Mandelbrot on page 69: Benoit B. Mandelbrot, Multifractals and 
1/f noise: Wild self-affinity in physics (1963-1976), Selecta Volume N, Selected Works 
of Benoit B. Mandelbrot, Springer-Verlag, New York, 1999, p. 9.


Quotation from Albert Einstein on page 83: These two quotes by A. Einstein are from pages
228-9 in The New Quotable Einstein, collected from Einstein’s archives and edited by 
Alice Calaprice, Princeton University Press, 2005. In their original, they are from conver- 
sations, the second one with Einstein and Valentine Bargmann, meaning that God makes 
us believe we have understood something that in reality we are far from understanding. 


Quotation from Percy Bysshe Shelley on page 91: Percy Bysshe Shelley, “Queen Mab: A 
Philosophical Poem, with Notes,” published by the author, London, 1813. 


Quotation from Douglas Adams on page 99: Quoted from the front matter in Douglas Adams, 
Hitchhiker's Guide to the Galaxy (Mostly Harmless), Del Rey, Ballantine Books, New 
York, 2005. 


Quotation from Yves Meyer on page 109: Yves Meyer, Wavelets and functions with bounded 
variation from image processing to pure mathematics, Atti della Accademia Nazionale dei 
Lincei, Classe di Scienze Fisiche, Matematiche e Naturali, Rendiconti Lincei, Serie IX, 
Matematica e Applicazioni, vol. XI, 2000, special issue: Mathematics Towards the Third 
Millennium (Papers from the International Conference held in Rome, May 27-29, 1999), 
p. 95. 


Quotation from Oliver Heaviside on page 157: Oliver Heaviside, On operators in physical 
mathematics, part II, Proceedings of the Royal Society of London, vol. 54, 1893, p. 121.


Quotation from Max Born on page 176: Jagdish Mehra and Helmut Rechenberg, The Historical
Development of Quantum Theory, vol. 3: The Formulation of Matrix Mechanics and
Its Modifications, 1925-1926, Springer-Verlag, New York, 1982, p. 129, footnote 146. 


Quotation from Werner Heisenberg on page 179: Jagdish Mehra and Helmut Rechenberg, The 
Historical Development of Quantum Theory, vol. 3: The Formulation of Matrix Mechanics 
and Its Modifications, 1925-1926, Springer-Verlag, New York, 1982, p. 94. 


Quotation from Norbert Wiener on page 205: Norbert Wiener, I Am a Mathematician, Doubleday,
Garden City, NY, 1956; Victor Gollancz, London, 1956; The MIT Press, Cambridge,
MA, 1964, pp. 108-109. 


Reminder: In the symbol list and in the chapters, function spaces are defined with
respect to various integrability conditions. For a function $f$ on a space $X$, absolute
integrability refers to $|f|$, i.e., to the absolute value of $f$, and to a prescribed
(standard) measure on $X$. This measure on $X$ is often implicitly understood, as is its
$\sigma$-algebra of measurable sets. Examples: If the space $X$ is $\mathbb{R}^d$, the measure will be
the standard $d$-dimensional Lebesgue measure; for the one-torus $\mathbb{T}$ (i.e., the circle
group), it will be normalized Haar measure, and similarly for the $d$-torus $\mathbb{T}^d$; for
$X = \mathbb{Z}$, the measure will simply be counting measure; and for $X_3$ (the middle-third
Cantor set), the measure will be the corresponding Hausdorff measure $h_s$ of fractal
dimension $s = \log_3(2)$. In each case, we introduce Hilbert spaces of $L^2$-functions,
and the measure will be understood to be the standard one. Same convention for the
other $L^p$-spaces!


$A(i_1, \dots, i_n)$ : the cylinder set
$\{\omega \in \Omega \mid \omega_1 = i_1, \dots, \omega_n = i_n\}$,
i.e., the set of infinite strings
$\omega = (\omega_1, \dots)$ specified by
$\omega_1 = i_1, \dots, \omega_n = i_n$
43, 47, 85


$\mathfrak{A}$ : the C*-algebra of the canonical
anticommutation relations
[page numbers illegible]


$\mathfrak{A}_n$ : family of algebras increasing in
the index $n$, $\{f \in C(\Omega)
\mid f(\omega) = f(\omega_1, \omega_2, \dots, \omega_n)\}$
44, 45, 139


$B(\mathcal{H})$ : bounded linear operators on a
Hilbert space $\mathcal{H}$
183, 218


$\mathcal{B}$ : Borel $\sigma$-algebra
6, 40, 53, 204


Bo : Borel $\sigma$-algebra on [illegible]
115


$C(\Omega)$ : continuous functions on $\Omega$
[page numbers illegible]


CAR : canonical anticommutation
relations
138-140, 154


$\mathbb{C}$ : the complex numbers
25, 43, 46, 48-52, 57, 61, 140, 184,
210, 214, 220


$\mathbb{C}^k$ : k-dimensional complex vector
space
31


$C_3$, $C_4$ : Cantor sets
195, 197-199 


D : maximal abelian subalgebra 
154 


$D_Y$ : smallest $\sigma$-algebra with respect
to which $Y$ is measurable
xxiv


Dy : closed linear span 
209, 217 


$e_\lambda(t) = e^{i 2\pi\lambda t}$, $e_n(z) = z^n$ : Fourier
basis functions
61, 71-79, 130, 192, 198


[symbol illegible] :
special matrix element generators
135, 139, 183


$\mathcal{F}$ : $\sigma$-algebra
37
$\mathcal{F}_n$ : system of $\sigma$-algebras
37
GMRA : generalized multiresolution 
analysis 
114 


$h$ : special (harmonic) function, a
Perron–Frobenius eigenfunction
for $R_W$, a measurable function on
$X$ such that $R_W h = h$
xxxiv, 11, 19, 49, 55, 92, 101, 105,
116


$h_{\min}$, $h_p$ : minimal eigenfunction for
$R_W$
100-102, 105-107


$h_3$ : minimal eigenfunction corresponding
to the scale-3 stretched
Haar wavelet
107


$h_s$ : Hausdorff measure
14, 17


H : some (complex) Hilbert space 
14, 17, 114-117, 131, 136, 140, 
169-170, 180-184, 189, 190, 
196, 210, 218-219 


$I$ : identity operator or identity matrix
(see also $1_{\mathcal{H}}$)


115, 131, 135, 136, 139-141, 184, 
211, 214-219 


I : index set 
172, 186, 189-190 


T : multiindex 
165-168 


IFS : iterated function system 
xxxv, xliv, 5, 14, 15, 34, 35, 67, 70, 
80, 84, 99, 152, 182 


$\operatorname{ind\,lim}_{n \to \infty} \mathfrak{A}_n$ : inductive limit of an
ascending family of algebras
139


K : some Hilbert space 
161, 169, 170, 172, 189 


$\ell^1$ : all absolutely summable sequences
66


$\ell^2(\mathbb{N})$, $\ell^2(\mathbb{N}_0)$ : all square-summable
sequences indexed by $\mathbb{N}$, or by $\mathbb{N}_0$
31, 140, 162, 182, 190, 193, 197


$\ell^2(\mathbb{Z})$ : all square-summable
sequences indexed by $\mathbb{Z}$
30, 32, 117, 136, 143, 191, 193,
200-202, 213, 219


$\ell^2(X)$, $\ell^2$ : all square-summable
sequences indexed by a set $X$ or
other index set
31, 66, 143, 160, 161, 168, 170,
172, 184, 189


$L^1(\mathbb{R})$ : all absolutely integrable
functions on $\mathbb{R}$
130


$L^2(\mathbb{R})$ : all square-integrable functions
on $\mathbb{R}$
xxxii, 4, 5, 10, 12-16, 29, 33, 65,
71, 87, 91, 103-105, 109, 112,
114, 129, 130, 158, 162-163, 165,
181, 190-194, 198


$L^2(\mathbb{R}^d)$ : all square-integrable
functions on $\mathbb{R}^d$
4, 22, 97, 109, 142, 229, 230


$L^2(\mathbb{T})$ : all square-integrable functions
on $\mathbb{T}$
66, 132, 136, 162, 167, 182, 190,
192, 193, 196, 197, 210, 213, 214,
219


$L^2(\cdot)$ : all square-integrable functions
on some specified set with its
standard measure
14, 17, 72, 77, 79, 112, 132, 136,
191, 196-198


$L^2(X, \mathcal{B}, \mu)$, $L^2(\mu)$ : all square-integrable
functions on the
$\sigma$-finite measure space $(X, \mathcal{B}, \mu)$
31, 72


$L^\infty(\mathbb{T})$ : all essentially bounded and
measurable functions on $\mathbb{T}$
95, 163, 190, 191


$L^\infty(X)$ : all essentially bounded and
measurable functions on $X$ with
respect to the standard measure
and $\sigma$-algebra of measurable
subsets
9, 43, 44, 49, 115


MRA : multiresolution analysis 
6, 181, 194, 198 


$m$ : function on $\mathbb{T}$ representing a digital
filter
4, 10, 114


$m_i$ : multiband filter functions
123, 126, 190, 191, 194, 211


$m_0$ : low-pass filter
111, 124-129


$m_1$ : high-pass filter
111, 124-129


$M$ : multiplication operator
213


$M_n = M_n(\mathbb{C})$ : $n \times n$ complex
matrices
139


$M_{2^n} = M_2 \otimes \cdots \otimes M_2$
139


$\mathbb{N}$ : the positive integers or natural
numbers
6, 11, 135


$\mathbb{N}_0 := \{0, 1, 2, \dots\} = \{0\} \cup \mathbb{N}$
5, 11, 59, 66, 85, 116, 117, 159, 
160, 164, 166, 182, 186, 188 


ONB : orthonormal basis in a Hilbert 
space 
13, 15, 16, 56, 71, 72, 76, 77, 103, 
104, 140-198 passim 


$\mathcal{O}_N$, $\mathcal{O}_n$, $\mathcal{O}_2$ : Cuntz algebra
131, 136, 139-152, 154, 158,
161-164, 167, 170, 176, 179-184,
189, 190, 192, 194, 196, 203, 205,
211, 214, 218, 219


$P_x$ : transition probability initialized
at $x$; measure on $\Omega$ such that
$P_x[f] = P_x^{(n)}[f]$ for all $f \in \mathfrak{A}_n$
5-11, 19, 26, 37, 43, 44, 62, 100


$P_x(\,\cdot \mid \cdot\,)$ : conditional probability
initialized at x 
51 


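$P_x(\mathbb{N}_0)$ : path-space measure of the
natural numbers $\mathbb{N}_0$ as subset of $\Omega$
$= \sum_{k \in \mathbb{N}_0} P_x(\{\omega(k)\})$, where
$P_x(\{\omega(k)\}) = \prod_{p=1}^{n} W(\tau_{\omega_p} \cdots \tau_{\omega_1}(x))
\cdot \prod_{p=1}^{\infty} W(\tau_0^p \tau_{\omega_n} \cdots \tau_{\omega_1}(x))$
11, 18, 60, 71, 78, 86, 88-91, 100,
102, 116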


$P_x(\mathbb{Z})$ : path-space measure of the
integers $\mathbb{Z}$ as subset of $\Omega$
11, 18, 60, 64, 71, 90, 116 


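$P_x^{(n)}[f]$ : transition probability
initialized at $x$ and conditioned by
$n$ coordinates
$= \sum_{(\omega_1, \dots, \omega_n)} \prod_{p=1}^{n} W(\tau_{\omega_p} \cdots \tau_{\omega_1}(x))
\cdot f(\omega_1, \dots, \omega_n)$, $f \in \mathfrak{A}_n$
21, 44, 45, 63, 64, 116, 122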


$\mathrm{Pos}(\mathcal{H})$ : operator with spectrum
contained in $[0, \infty)$
114, 115, 117 


$R_W$, $R$ : Perron–Frobenius–Ruelle
transfer operator
$(R_W f)(x) = \sum_{\sigma(y) = x} W(y) f(y)$
xxxiv, 9, 11, 19, 26, 43, 45, 49,
51-57, 61, 64, 66, 76, 86, 91, 95,
100, 101, 105, 115, 116, 200


$\mathbb{R}$ : the real numbers
33, 10, 14, 195, 199 


R : envelope of a fractal 
195-199 


s : Hausdorff dimension 
14, 17, 71, 72, 77 


S:= F* 
67 


: adjoint operator 


$S_i$, $S_i^*$, $T_i$, $T_i^*$ : the operators (isometries)
and their adjoints (with
stars) in a representation of the
Cuntz relations (i.e., of the Cuntz
algebra)
131, 132, 135, 161, 181, 182, 184,
201, 211, 213, 214, 219


$\mathbb{T} := \{z \in \mathbb{C} \mid |z| = 1\}$ :
circle group, or one-torus
$= \mathbb{R}/\mathbb{Z} \cong [0, 1)$
25, 32, 60, 61, 190, 204 


Uz : dyadic scaling operator 
200 


$V$ : cocycle, i.e., a measurable function
on $X \times \Omega$ such that
$V(\tau_{\omega_1} x; (\omega_2, \omega_3, \dots)) = V(x; \omega)$
43, 49, 92 
Vo, Vi, V_ : resolution subspaces 


22, 33, 104, 111, 123-128 


V; : representation of Cuntz algebra 
180-197 passim 


$W$ : a measurable function
$X \to [0, 1]$

xxxiv, 7-12, 17-21, 36, 41-45, 48, 
49, 51-57, 61-66, 69, 71, 76, 77, 
84-91, 101, 104, 105, 112-115, 
117, 140, 141, 162 


Wns wo : detail subspaces 
33, 123-128 


X : a fractal 
110 


$X$ : a measurable space
xxxiv, 6, 7, 39, 47, 115, 117 


X, X3, X4, X4 : Cantor sets 
14, 21, 71-80, 176 


$X_j(\omega) = \omega_j$ : coordinate functions on
a probability space
50


$(X, \mathcal{B})$ : a set $X$ with a $\sigma$-algebra
$\mathcal{B}$ of measurable subsets
6, 40, 84, 114, 115


$z = e^{i 2\pi t}$ : Fourier variable
32


$Z_n(x, \omega)$ : canonical martingale
50, 51
$\mathbb{Z}$ : the integers
5, 19, 22, 59, 66
$\mathbb{Z}_2 := \{0, 1\}$ : cyclic group of order 2
27


$\mathbb{Z}_N$ : cyclic group of order $N$
$= \mathbb{Z}/N\mathbb{Z} = \{0, 1, \dots, N-1\}$
41, 43, 44, 49, 52, 60, 86, 88, 116, 
211, 214-219 


$\delta$ : Kronecker delta function
15, 46, 47, 104, 116, 131, 139, 163,
164, 183, 192, 211, 216, 219


$\delta_0$ : Dirac mass at $x = 0$
102, 105


$\lambda$ : Fourier frequency
71-78, 112, 198


$\Lambda$ : index set for a Fourier orthonormal
basis
71-79, 198


$\mu$ : the Haar measure, or other measure
specified in the text
14, 41, 43, 61, 72, 77, 79, 136-139,
167, 168, 195, 198


: multiplicity function
114, 117


$\mu \circ \sigma^{-1}$ : is the measure given by
$(\mu \circ \sigma^{-1})(B) := \mu(\sigma^{-1}(B))$
52, 72


$\nu$ : Perron–Frobenius–Ruelle measure,
or other measure specified in the
text
xxxiv, 52-54, 101, 105


256 Symbols 


$\rho$ : representation or state
47, 48, 139-141, 154 


$\sigma$ : one-sided shift, an onto map
(actually endomorphism) $X \to X$
such that $\#\sigma^{-1}(\{x\})$ is constant
xxxiv, 6-8, 12, 17, 41, 45, 47, 
51-54, 62, 64, 71, 74-76, 84, 
89-91, 101, 114, 115, 159, 160, 
170-172, 184, 186, 188 


$\sigma^{\Omega}$ : shift on $\Omega$
47, 51, 52 


$\sigma^{-1}(B)$ : pre-image under the mapping
$\sigma$ $:= \{x \in X \mid \sigma(x) \in B\}$
6, 41, 72, 52, 84 


$(\sigma^{\Omega})^{-1}$ : pre-image under the mapping
$\sigma^{\Omega}$
52


$\tau_0, \dots, \tau_{N-1}$ : branches of $\sigma^{-1}$, maps
$X \to X$ such that $\sigma \circ \tau_i = \mathrm{id}_X$
7, 41, 47, 52, 72, 89, 115, 159 


$\tau_i^{\Omega}$ : branches of $(\sigma^{\Omega})^{-1}$


47, 48, 52 


$\varphi$ : scaling function
3, 10, 12, 13, 15, 23, 102, 103, 114, 
134 


$\varphi_0, \varphi_1, \varphi_2, \dots$ : wavelet packet system
112, 113, 118-122, 168, 191 


xy: characteristic function 
14, 16, 47 


$\psi$ : wavelet function
13, 16, 23, 102, 103, 134


$\psi_{n,k}$ : wavelets
15


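$\omega(k)$ : representation in $\Omega$ of
$k \in \mathbb{N}_0$. If $k =
\omega_1 + \omega_2 N + \cdots + \omega_n N^{n-1}$
is the Euclid $N$-adic representation,
$\omega(k) :=
(\omega_1, \dots, \omega_n, \underbrace{0, 0, 0, \dots}_{\infty \text{ string of zeroes}})$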
11, 18, 77, 79, 92, 101, 122, 130, 
135, 137, 138, 140 


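$\Omega$ : probability space
$= \{0, 1, \dots, N-1\}^{\mathbb{N}}$
$= \prod_{\mathbb{N}} \{0, 1, \dots, N-1\}$
$=$ all functions:
$\mathbb{N} \to \{0, 1, \dots, N-1\}$
$= \{(\omega_1, \omega_2, \dots)
\mid \omega_i \in \{0, 1, \dots, N-1\}\}$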
5, 7, 11, 18, 20-37, 43, 46, 47, 49, 
69, 85, 135 


$(\Omega, \mathcal{B}, \nu)$ : probability space
56, 203 


0 : one-sided infinite string of zeroes
$= (\underbrace{0, 0, 0, \dots}_{\infty \text{ string of zeroes}}) \in \Omega$


12, 85, 116, 130 


{0} : the set with the one element 0 
12, 85, 116 


$1_{\mathcal{H}}$ : identity operator (see
also $I$)
114, 115, 117, 122, 160, 161, 181, 
182, 184 


1 : constant function equal to 1 
46, 61, 64, 136 


*-algebra, *-isomorphism 
221 


*-automorphism 
211 


ty 


S 


: lattice operation applied to closed 
subspaces in a Hilbert space: the 
lim sup lattice operation 

169, 181 


: lattice operation applied to closed 
subspaces in a Hilbert space: the 
lim inf lattice operation 

169, 181 


$\emptyset$ : empty set
171, 172, 185,


$\overline{E}$ : closure of a set $E$
44, 172


$\hat{\varphi}$ : Fourier transform (of the scaling
function $\varphi$)
10, 114, 111


$\ll$ : relatively absolutely continuous
(relation between measures)
50, 53


$\uparrow$ : up-sampling
124, 132, 213, 214


$\downarrow$ : down-sampling
124-128, 132, 133, 212, 213, 215


$\oplus$ : direct (orthogonal) sum
112, 172, 218


$\otimes$ : tensor product
139, 158, 161-163, 165, 170-172,
180-183, 189, 190, 193, 194, 197


$\ominus$ : relative orthogonal complement
169


$\times$ : Cartesian product
11, 43, 49, 52, 88, 117, 130, 164,
166, 185, 188, 218


$\#$ : counting function
6, 41, 72, 101, 159, 184, 196


$\langle\, \cdot \mid \cdot \,\rangle$ : inner product
16, 75, 77, 79, 104, 114, 140


$|\,\cdot\,\rangle$ : Dirac vector
160-163, 171, 182, 184, 185, 189,
193, 194


$[\,\cdot, \cdot\,)$ : interval closed to the left and
open to the right
41, 61, 63, 65, 136-138, 165, 166,
167, 192


$[\,\cdot, \cdot\,)$ : segment of $\mathbb{N}_0$
165-167, 186, 188


$[\,\cdot, \cdot\,]$ : interval closed at both ends
7, 11, 13, 16-18, 47, 62-66, 71, 77,
84, 89-92, 102, 105, 112, 125,
130, 135-139, 195


Index 


Comments on the use of the index: Some terms in the index may appear in the 
text in a slight variant, or variation of the actual index-term itself. For example, we 
will have terms in the index referring to “theorem so and so.” But when we use the 
Stone–Weierstraß theorem, I just say Stone–Weierstraß. The word “theorem” will be
suppressed. It is implicitly understood.


Similarly, I often just say, “by domination” (or some variant thereof), when I mean, 
“by an application of the dominated convergence theorem,” or more fully: “By 
Lebesgue’s dominated convergence theorem.” It will be the same theorem whether 
the name is abbreviated or not. 


For Fubini, the word “theorem” may be implicitly understood. Guido Weiss has made 
a verb out of it: “Fubinate” means “to exchange the order of two integrals.” 


Similarly, the name Fatou often is used to mean “Fatou’s lemma” (the one about 
lim inf). For some reason poor Fatou only got credit for a lemma. But I do not mind 
upgrading him to a theorem, although “Fatou’s theorem” usually refers to the one 
about existence a.e. of boundary values of bounded harmonic functions. I usually 
call that one “the Fatou–Privalov theorem.”


A-random variable, see random variable, A-
abelian, see algebra, abelian; group, abelian;
maximal abelian subalgebra
absolutely continuous, see measure,
absolutely continuous
adjoint, see matrix, adjoint; operator, adjoint
a.e. convergence, see convergence, a.e.
affine
— fractal, xxiii, 3, 5, 15, 22, 26, 70-72,
77, 80, 180, 194, 198
— iterated function system, 5, 14, 15, 25,
67, 70-72, 80, 81
— iteration, 21
— map, [page numbers partly illegible] 81, 195
self-— tiling, see tiling, self-affine
— [illegible cross-reference ending in “wavelet”]
algebra, xvii, xx, 1, 3, 27, 43, 44, 47, 54,
140, 176, 251, 252
abelian, 110, 138, 139
C*-, see C*-algebra
CAR-, see CAR-algebra
Cuntz, see Cuntz algebra 
Cuntz—Krieger, see Cuntz—Krieger algebra 
dense sub-, 46 
fermion, see fermion algebra 
matrix, 139, 210 
maximal abelian sub-, see maximal 
abelian subalgebra 
non-abelian, xxviii, 138, 139, 155, 218 
operator, see operator algebra 
o-, see o-algebra 
*-, 221 
sub-, 44, 139, 154, 155 
algebraic structures 
representations of, see representation of 
algebraic structure 
algorithm, xvi, xix, xx, xxv, xxvi, xxxiii,
xxxiv, xxxvi, xliii, 3, 4, 10, 33, 35,
124, 125, 128, 147, 148, 164, 206, 
210, 211, 215, 223-226, 230, 231 
cascade, see cascade 
Euclid’s, xx, 11, 40, 63, 69, 85, 92, 164, 
166 
Gram-Schmidt, xv, xxvi 
matrix, xxvi, xxviii, xxxii, 142, 147, 
157, 226, 230, see also matrix step in 
algorithm 
pyramid, vi, xx, xxxiv, xxxv, xlv, 33, 
111-113, 122, 123, 125, 128, 129, 
134, 148, 157-159, 182, 225, 227, see 
also pyramid 
recursive, xxvii, xxxii, xxxiv, xxxv, xlv, 6,
109, 147, 157, 226 
subdivision, xxix, xxxi, xxxii, xxxv, 124,
142, 227 
wavelet, xiii, xvi, xix, xxvii-xxix,
xxxi-xxxiii, 25, 33, 110, 123, 125, 
133, 142, 147, 148, 151, 152, 156, 
206, 210, 215, 226, 227, 230 
N-adic, 125, 126 
wavelet-like, xxv 
wavelet packet, xxxiv, xxxv, 3, 34, 123, 
125, 126 
alias, 229, 230 
ambient 


— Euclidean space, see space, Euclidean, 
ambient 
— function space, see space, function, 
ambient 
— Hausdorff measure, see measure,
Hausdorff, ambient 
— Hilbert space, see space, Hilbert, 
ambient 
analysis 
data, xxv 
(engineering), xvi, xxii, xxvi, 124, 132, 
148, 205-207, 214, 227, see also 
frequency band; perfect reconstruc- 
tion; synthesis; signal analysis; signal 
processing 
Fourier, see Fourier analysis 
fractal, see fractal analysis 
harmonic, see harmonic analysis 
(mathematics), xvii, xxvii-xxxi, xxxiii, 
xliv, 3, 6, 9, 22, 26, 33-35, 37, 39, 
59, 80, 81, 84, 87, 98, 206, see also 
Fourier analysis; harmonic analysis; 
spectral analysis 
multiresolution, see multiresolution 
analysis 
numerical, xxvii, xxxii, 229 
spectral, see spectral analysis 
stochastic, 35 
wavelet, see wavelet analysis 
approximation, xxvi, 4, 6, 79, 90, 107, 109, 
158, 226 
cascade, see cascade approximation 
— theory, xxix 
atomic, see measure, atomic 
attractor, 34, 187 
automorphism 
*-, 211


B-measurable, see measurable, B- 

B-measure, see measure, B- 

band-limited wavelet, see wavelet, 
band-limited 

base-point representation, see representation, 
base-point 

bases, see basis 


basis, xv, xvi, xx, xxvi, xxxi, xxxvi, xliv, 2,
5, 9, 22, 30, 59, 67, 70, 71, 87, 111, 
143, 146, 148, 157, 162, 179, 182, 
184, 228, 229 
bi-orthogonal, 143 
canonical, 143, 160, 202 
dual, 143, 144 
Fourier, xvi, 67, 144, 158, 168, 193, 252, 
255 
fractal, 21, 26, 70-72, 79 
frame, 28, 29, 143, 229 
— function, xv, xxxvi, 104, 106, 112, 113, 
129, 166 
localized, 67, 80, 157, 158, see also 
localization property of wavelet bases 
orthogonal, xxvi, 36, 69, 70, 87 
orthonormal, xxxi, 13, 15, 16, 22, 26, 28, 
29, 32, 36, 55, 56, 65, 71, 72, 74, 76, 
77, 79, 99, 103-106, 130, 139, 140, 
143, 144, 149, 150, 162, 163, 165, 
166, 168, 177, 182, 184, 185, 189, 
190, 192-194, 197, 198, 228, 229, 
254, 255 
Parseval, 103 
permutation of, 182 
recursive, 179 
— transformation, 166 
wavelet, xvi, xix, xxvi, xxxiii, xliii, xliv, 
13, 15, 22, 23, 67, 72, 80, 87, 99, 103, 
142, 176, 179, 180, 187-189, 208, 
229, see also localization 
dyadic, 99, 202 
fractal, 180 
wavelet-like, 109 
Bernoulli product measure, see measure, 
p-Bernoulli-product 
Bethe lattice, see lattice, Bethe 
bi-orthogonal, see basis, bi-orthogonal 
black box, xx, xxi 
Borel 
— cross section, 183 
— measure, see measure, Borel 
— o-algebra, see o-algebra, Borel 
— subsets, 135, 167, 199 
M. Born, 176, 205, 236
boundary 
— for harmonic function, see harmonic
function, boundary for 
— representation, 19, 48 
— value, 21, 43, 48 
branch mapping, 256 
measurable, 41 
n-fold, 184, 185, 188, 189 
2-fold, 186, 187 
branching, 5, 158, 161, see also random 
walk on branches 
dyadic — system, 160 
— system, 159, 171 
C. Brislawn, xxxiii, xliv, 22, 234 
Brownian motion, xxvi, xxvii, 56, 57 
fractal, xxiii 
fractional, xxvii, 57 
s-fractal, xxiii 


C*-algebra, xxix, 5, 6, 131, 138-140, 142, 
151, 154, 155, 183, 211, 251 
canonical anticommutation relations, 
5, 138-140, 154, 251, see also 
CAR-algebra 
canonical basis vector, see basis, canonical 
Cantor, 1, 69, 179 
— construction, 69, 70, 74 
— group, see group, Cantor 
— measure, see measure, Cantor 
—’s example, see measure, Cantor 
— scaling identity, see scaling identity, 
Cantor 
— set, 2, 5, 15, 71-73, 252, 255 
conjugate, 73, 75-77 
duality for —s, 69 
middle-third, 2, 5, 14, 15, 21, 25, 27, 
69-71, 73, 74, 80, 176, 189, 195, 251 
quarter, 5, 26, 72-77, 79, 198 
scale-4, see Cantor set, quarter 
CAR-algebra, 5, 138, 139, 154, 155 
representations of, see representation of 
CAR-algebra 
L. Carleson, 32 
cascade, 9, 130, see also closed subspaces,
nested family of 
— approximant, 134 
— approximation, 4 
Cauchy product, 87, 206 
closed linear span, 15, 169, 200, 207, 209, 
252 
closed subspaces 
nested family of, xviii-xx, xxii, xxix, 9, 
33 
cocycle, 43, 48-52, 91, 92, 255 
— identity, 19, 20 
— property, 49 
coefficient, 168 
autocorrelation, 104 
filter, 4 
Fourier, xvi, xxii, 87, 97 
masking, 4, 10, 16, 23, 87, 91, 114, 123, 
130, 131, 135, 227 
matrix, 131 
operator, 114 
response, 10 
wavelet, xxvi, 23, 25, 202, 225, 227, 230 
wavepacket, 168 
A. Cohen, xxxii, xxxiii, 33, 87 
co-isometry, 162, 216 
combinatorial 
— probability theory, 5, 154 
— tree, 6, 111, 129 
commute, 94, 135, 234, 235, see also 
non-commutative setting 
compact, 1, 8, 14, 25, 27, 35, 43, 69, 71, 83, 
92, 98, 149, 195, 204 
— abelian group, see group, abelian, 
compact 
—~ Hausdorff space, see space, compact 
Hausdorff 
—— operator, see operator, compact 
— support, see wavelet, compactly 
supported 
conditional expectation, 10, 57 
conjugate Cantor set, see Cantor set, 
conjugate 
conjugation, 150 


ad baie dike Ove ae ne ee decision tree 


consistency, xx, 80, 122, see also 
Kolmogorov consistency 
continued fractions, 41 
convergence, xxviii, xxx, 4, 6, 9, 10, 18, 80 
a.e., 32, 50, 56, 92, 95, 96 
dominated, 78, 89 
dominated — theorem, see theorem, 
dominated convergence 
martingale — theorem, see theorem, 
martingale convergence 
— of infinite product, 5, 8, 11, 17-19, 21, 
60, 85 
pointwise, 4, 5, 17-19 
countable family of o-algebras, see 
o-algebras, countable family of 
J. Cuntz, xxii, 183 
Cuntz 
— algebra, xxix, 5, 6, 22, 131, 136, 152, 
154, 155, 158, 161, 162, 179-183, 
187, 196, 205, 208, 210, 222, 254, 
255, see also representation of Cuntz 
algebra 
—-Krieger algebra, xxix 
— relations, xxii, 6, 132, 155, 160, 161, 
174, 179-182, 201, 203, 211, 214, 
216, 219, 221, 222, 227, 231, 254, see 
also representation of Cuntz algebra 
— representation, see representation of 
Cuntz algebra 
— system, 208, 221 
cycle, 26, 87 
cyclic group, see group, cyclic 
cylinder set, 43, 47, 78, 85, 115, 139, 251 


data mining, xvii, xix, xxiv, xxv, xxviii, 107,
224 
I. Daubechies, xxxii, xxxiii, 10, 33, 87 
Daubechies 
— scaling function, see scaling function, 
Daubechies 
— wavelet, see wavelet, Daubechies 
decision tree, xxxv 
decomposition, 166
Karhunen—Loéve, see theorem, 
Karhunen—Loéve 
orthogonal, 172 
Schmidt’s, see theorem, Schmidt’s 
wavelet, xxxii, 190 
Wold, 169 
derivative 
Radon-Nikodym, 50, 53 
detail space, see space, detail 
differentiability, 6 
differentiable, xxxii-xxxiv, 14, 135 
dimension, 4, 33, 117, 154, 210 
fractal, 14, 195, 251 
Hausdorff, xxiii, 2, 71, 72, 74, 77, 176, 
195, 251, 254 
scaling, 72 
Dini regularity, see regularity, Dini 
G. Dirac, 235 
P.A.M. Dirac, 58, 234, 235 
Dirac 
— mass, 26, 102, 255 
— notation, 55, 58, 182, 186, 257 
discrete wavelet transform, see wavelet 
transform, discrete 
distribution, xxiii, 130, 142, 168, 204 
Gaussian, 56, 203, see also random 
variable, Gaussian 
D. Donoho, xxxiii 
J. Doob, xxiv, xxvi 
down-sampling, see sampling, down- 
dual 
— basis, see basis, dual 
— filter, see filter, dual 
Fourier, see Fourier dual 
— high-pass filter, see filter, dual 
high-pass 
— lattice, see lattice, dual 
— low-pass filter, see filter, dual low-pass 
— variable, xxi, 206, 212 
— wavelet, see wavelet, dual 
duality, 69, 205, 207, 212 
— for Cantor sets, see Cantor set, duality 
for 
Fourier, 35, 60, 81, 207, 209, 210
particle-wave, 131 
time-frequency, 207, 212 
D. Dutkay, xliii, 57, 72, 87, 97, 194 
dyadic 
— branching system, see branching, 
dyadic — system 
— fractional subinterval, 166-168 
— Haar wavelet, see wavelet, Haar, 
dyadic 
— pyramid, see pyramid, dyadic 
— rationals, 66, 90, 167 
— representation, 166 
— scaling, see scaling, dyadic 
— subdivision, see subdivision, dyadic 
— tiles, see tiling, dyadic 
— wavelet, see wavelet, dyadic 
— wavelet packet, see wavelet packet, 
dyadic 
dynamics, xxix, xxx, xxxvi, xliv, 9, 34, 37, 
42, 168 
complex, 25, 34 
symbolic, xxxv, 34, 182 


eigenfunction, 19 
minimal, 15, 19, 99-102, 105, 252 
Perron—Frobenius, 26, 97, 252 
eigenspace, 19 
Perron—Frobenius, 107 
eigenvalue, 9, 11, 19, 77, 100, 107, 116 
endomorphism, xxxiv, 4, 52, 91, 101, 184, 
256 
engineering, xiii, xxviii, xxxi, xxxii, 
xxxiv-xxxvi, xliv, 17, 87, 88, 124,
204, 210, 212, 215, 227, 228, 230 
equivalence 
— class, 172, 183 
— relation, 172 
ergodic theory, xliv, 84 
ergodicity, 155 
Euclidean algorithm, see algorithm, Euclid’s 
Euclid’s algorithm, see algorithm, Euclid’s 
expansion, xxxi, 168 
Fourier, see Fourier expansion 
N-adic, 11
orthogonal, xxix 
wavelet, 23, 25 
extension 
unitary matrix, 107 
unitary — principle, see unitary extension 
principle 


factorization, 34 
matrix, 210 
— of unitary operators, 158, 180 
operator, 216 
Schmidt’s, see theorem, Schmidt’s 
tensor, 168, 170, 181, 186, 190, 194, 198 
Farey tree, 41, 42 
father function, see function, father 
Fatou 
— —Primalov theorem, see theorem, 
Fatou—Primalov 
—’s lemma, see theorem, “Fatou’s 
lemma” 
— set, 25 
fermion, 139, 154 
— algebra, 154 
filter, xxxiii, 4, 87, 124, 133, 205, 206, 210, 
213, 215, 227, 231, 253 
dual, 124, 205 
dual high-pass, 124, 132 
dual low-pass, 124, 132 
high-pass, 23, 87, 111, 124, 125, 132, 154, 
205, 253 
low-pass, xxxiv, 4, 23, 37, 87, 104, 111,
124, 125, 132, 154, 204, 205, 253 
— orthogonality, see orthogonality, filter 
quadrature-mirror, 3, 4, 10, 23, 40, 87, 
135, 168, 186, 205, 212, 217, 227, 228 
subband, xxix, xxxiv, 23, 26, 87, 123, 124, 
126, 205, 210, 211, 213, 215, 228 
wavelet, see wavelet filter 
fixed-point problem, 10 
four-tap, vi, 22, 134, 135, 146, 147, 202, 228 
Jean Baptiste Joseph Fourier, xxxi 
Fourier 
— analysis, xv, xxviii, 2, 30, 225, 226 
— basis, see basis, Fourier
— coefficient, see coefficient, Fourier 
— correspondence, 67 
— dual, xxi, xxx, 35, 60, 81, 198, 207, 
209, 210 
— pair, xxi, 36 
— equivalence, 110 
— expansion, xxii, 61, 157, 158 
— frequency, 21, 26, 69, 70, 110, 255 
Mock — series, 80 
— series, xxi, xxii, xxvii, xxxi, 10, 32, 59,
67, 69, 80, 145, 163, 206 
— transform, xxi, 4, 10, 35, 102, 105, 
111, 114, 191, 208, 257 
inverse, xxii 
— variable, 255, see also dual variable 
fractal, xxviii, xxix, xxxvi, 7, 15, 21, 22, 34, 
35, 37, 67, 72, 74, 77, 80, 97, 152, 182, 
194, 198, 210, 211, 227, 231, 254, 255 
affine, see affine fractal 
— analysis, xxxv, xliv, 6, 36, 60, 98, 210 
— dimension, see dimension, fractal 
— Hilbert space, see space, Hilbert, 
fractal 
— measure, see measure, fractal 
— theory, xxix, 25 
— wavelet, see wavelet, fractal 
fractions 
N-adic, 90, 92 
2-adic, 89, 138 
frame, xliii, 104, 126, 143, 176, 228-230 
affine wavelet, 230 
— bound, 29, 143, 207, 228 
— constant, see frame bound 
normalized tight, 16, 99, 104, 106, 176, 
see also frame, Parseval 
— operator, 207 
Parseval, 13, 16, 99, 104-106, 162, 228 
super-, 230 
tight, 162, 228 
— wavelet, 100, 229 
frequency, xxxi, xxxv, 23, 123, 125, 131, 
204, 206, 207, 212, 213, 228 
— band, xx, 87, 124, 177, 181, 210 
— domain, 4, 87
—-localized wavelet, see wavelet, 
frequency-localized 


— response, 4, 17, 87, 131, 204, 206, 207,
210, 227
— subband, 123 
Frobenius, xliii, see also eigenfunction, Per- 
ron—Frobenius; eigenspace, Perron— 
Frobenius; matrix, Perron—Frobenius; 
operator, Perron—Frobenius—Ruelle; 
Perron—Frobenius—Ruelle theory; 
theorem, Perron—Frobenius 
Fubini’s theorem, see theorem, Fubini’s 
function 
basis, see basis function 
bounded continuous, 14, 195 
bounded measurable, 43, 47, 49, 51, 53, 
61, 115, 219, 253 
constant, 15, 46, 61, 91, 136, 256 
continuous, 7, 43, 105, 251 
eigen, see eigenfunction 
father, xxx, 13, 102, 123-125, 128, 134, 
135 
filter, 4, 87, 104, 114, 123, 132, 192, 196, 
207, 227, 253 
filter response, 87 
frequency response, see frequency 
response 
generating, 206 
harmonic, see harmonic function 
indicator, 14, 52 
iterated — system, see iterated function 
system 
L?-, 13, 14, 111, 190, 210, 251 
limit, 50, 54 
Lipschitz, see Lipschitz function 


matrix, 4, 22, 112, 117, 140, 141, 214, 215 


unitary, 227 
measurable, xxxiv, 7, 47, 54, 60-62, 65, 
70, 84, 87, 115, 199, 252, 255 
mother, xxx, 13, 102, 123-125, 128, 134, 
135 
multiplicity, see multiplicity function 
1-periodic, 18, 60-62, 66, 95-97, 104,
132, 174, 175, 217 

operator, 114, 116, 117 

unitary, xxix 

periodic, 4, 69, 87, 204, 208, see also 
function, 1-periodic 

positive definite, 57, 203, 204 

rational, 34 

refinable, 107 

scaling, 3, see scaling function 

— space, see space, function 

square-integrable, 210, 253 

step, xxxi

— theory, xxviii, xliv, 46 

time-localized, xxx, 177 

vector-valued, 131, 210 

W-, xxxiv, 7, 10, 19, 36, 37, 54, 101, 112 

wavelet, see wavelet function 

zeta, 34 


D. Gabor, xxxi 
gap-filling, 14, see also wavelet, gap-filling 
generalized multiresolution, xxv, 22, 110, 
180, 181, 252 

— analysis, 109, 114 
GMRA, see generalized multiresolution 
grayscale, xviii, 22, 147, 148, 152, 207, 227 
A. Grossman, xxxii 
group, 1, 5, 28, 36, 70, 109, 110, 205, 210, 

211, 214, 221 
abelian, 27, 69 
compact, 27 

Cantor, 28 

circle, 251, 254 

cyclic, 60, 174, 255 

infinite-dimensional unitary, 155 

Lie, 220 

non-abelian, xxvi 

renormalization, xxix 

sub-, 230 

torus, 204 

transformation, 211 
R. Gundy, xxxiii, xliii, 6, 33, 87 


A. Haar, xvi, xxvi, xxxi, 131, 223 


Haar 
dyadic — wavelet, see wavelet, Haar, 
dyadic 
— measure, see measure, Haar 
— wavelet, see wavelet, Haar 
harmonic 
— analysis, xix, xxviii-xxx, xliii, 2, 5, 19, 
22, 25, 26, 33, 60, 80, 87, 182, 229 
discrete, 35 
— of iterated function systems, xxx, 14, 
67, 80 
— function, 9, 18, 21, 22, 43, 48-52, 55, 
76, 86, 91, 92, 95, 100, 252 
boundary for, 21, 43, 48, 50 
bounded, 43, 48-50 
closed expression for, 15, 102, 105 
construction of, 100 
integral formula for, 50 
minimal, 11, 22, 105 
Py-, 100 
R-, xxxiv, 43, 50, 79, 91 
Hausdorff 
— dimension, see dimension, Hausdorff 
— measure, see measure, Hausdorff 
O. Heaviside, xvi, 157, 223, 235 
W. Heisenberg, 58, 131, 176, 179, 235, 236 
hermitian operator, see operator, hermitian 
high-pass filter, see filter, high-pass 
D. Hilbert, xxxi, 205 
Hilbert space, see space, Hilbert 
Hutchinson measure, see measure, 
Hutchinson 


image processing, vi-viii, xv, xix, xx, xxii,
xxvi, xxviii, xxxii-xxxvii, xliv, 6, 10,
23, 33, 40, 87, 142, 147, 148, 151, 
152, 189, 206, 223-227, 230, 233, 
235, see also signal processing 

infinite-dimensional unitary group, see 
group, infinite-dimensional unitary 

infinite product, xxviii-xxx, 4, 5, 7, 8,
10-12, 17-19, 21, 27, 33-35, 60, 67, 
83-85, 96, 97, 116, 138, 154, see also 
measure, infinite-product; Tychonoff
infinite-product topology 
convergence of, see convergence of 
infinite product 
matrix, 4, 22 
random, 22, 34 
tensor, 139, 148, 149, 158 
integers, 5, 21, 22, 37, 60, 71, 166, 254, 255 
integral translates, 18, 104, 181 
intermediate differences, 147, 148 
intertwining, 169, 200, 202, 209 
interval, unit, see unit interval 
invariant, 51, 52, 66 
— measure, see measure, invariant 
R-, 53 
shift-, 51, 52, 92 
σ-, 52, 53
— subspace, 109, 110, 146, 202, 209, 221
translation-, 129 
C.T. Ionescu Tulcea, 57 
irreducible representation, see representa- 
tion, irreducible 
isometry, xix, 32, 67, 93-95, 174, 176, 
182-185, 200, 201, 208, 216, 221, 
222, 229, 254 
partial, 94, 150, 173 
isomorphism, 47, 193, 194, 221 
C*-algebraic, 139 
isometric, 30, 112, 149, 207 
order-, 51 
∗-, 221
unitary, 162, 169, 191, 193, 194 
iterated function system, xxx, xxxv, xliv, 34, 
35, 47, 57, 67, 70, 84, 99, 152, 182, 
252, see also affine iterated function 
system 


JPEG 2000, xxxiii 


Karhunen—Loève decomposition theorem,
see theorem, Karhunen—Loève

A.N. Kolmogorov, xxvi, xxxi, 7, 8, 39, 43, 
46, 59, 84, 168, 170, 203, 204, 235 
Kolmogorov
— consistency, xxxi, 7, 45, 46, 48, 49, 
141, 203 
— extension, xxix, 21, 46, 48, 57, 97, 136, 
139, 151 
—’s lemma, see theorem, “Kolmogorov’s 
lemma”
—’s 0-1 law, xxxiii, 37 
Krieger 
Cuntz-— algebra, see Cuntz—Krieger 
algebra 


L¹-normalization, 12
L²-normalization, 12
ℓ²-sequence, 140
lacunary trigonometric series, 84 
lattice, 4, 154 
Bethe, 98 
dual, 28, 36, 69 
— operation, 169, 181, 257 
— system, see statistical mechanics, 
quantum 
W. Lawton, xxxiii, xliii, xliv, 21, 33, 57, 87 
Lebesgue 
— measure, see measure, Lebesgue 
—’s dominated convergence theorem, see 
theorem, dominated convergence 
limit, 20, 21, 50, 75, 76, 92, 154, 212 
exchange of —s, 89, 101 
— function, see function, limit 
inductive, 139, 140, 252 
martingale, 49 
non-tangential, 50 
Szegő’s — theorem, see theorem, Szegő’s
limit
Lipschitz 
— continuous, 77 
— function, xxxiv, 57, 191 
— regularity, see regularity, Lipschitz 
localization, 22, 158, see also basis, 
localized; function, time-localized; 
wavelet, frequency-localized 
— of Mock Fourier series, 80 


— property of wavelet bases, 80, 87, 157, 
225, 226 
low-pass 
— filter, see filter, low-pass 
— condition, 17, 228 
— property, 125, 130 


S. Mallat, xxxii, 33, 168 
Mallat subdivision, xxxi 
Markov 
— chain, 50 
— process, 21 
— transition measure, see measure, 
transition 
martingale, xxix, xxx, xxxiii, 10, 21, 35, 36,
50, 51, 57, 91, 168, 255 
— convergence theorem, see theorem, 
martingale convergence 
— limit, see limit, martingale 
masking coefficient, see coefficient, masking 
Mathematica, 35 
graphics produced using, xxxiv, xliii, 2, 
123, 125, 129, 152 
matrix, 129, 139, 141, 182, 183, 190, 191, 
210, 215, 216, 230, 252, 253 
adjoint, 147, 210, 214 
— algebra, see algebra, matrix 
— algorithm, see algorithm, matrix 
— coefficient, see coefficient, matrix 
diagonal, 140 
— diagonal, 141 
— element, 140, 154 
— entry, 219 
— factorization, see factorization, matrix 
— function, see function, matrix
identity, 183, 252 
infinite, xxvii, 142, 231 
infinite — product, see infinite product, 
matrix 
integral, 3 
— multiplication, 25, 128, 142, 143, 202, 
210, 216, 230 
— operation, 124 
operator, 210, 214, 215, 218 
Perron—Frobenius, 34 
polyphase, 205, 210, 215, 218 
positive, 34 
positive definite, 203 
positive semidefinite, 4 
— product, xxvi, 25, 218, 219, see also 
infinite product, matrix 
propagator, 34 
— representation, see representation, 
matrix 
slanted, 23-25, 142, 143, 145, 146, 202, 
206, 225, 230, 231 
sparse, 23, 206 
— step in algorithm, xxxv, 206, 207, 210 
sub-, 139 
— theory, 34 
Toeplitz, 23, 146 
— unit, 135 
unitary, 109, 129, 174, 182, 191, 201, 
205, 210, 211, 214, 216-218, 220, 
221, 227, see also extension, unitary 
matrix; function, matrix, unitary 
—-valued 
function, see function, matrix 
measure, see measure, matrix-valued 
wavelet filter, see wavelet filter, 
matrix-valued 
wavelet, 206, 225 
maximal abelian subalgebra, 154, 252 
measurable, xxiv, xxxiv, 5, 6, 40, 41, 47, 
115, 199, 252, 255 
B-, 41, 53 
branch, see branch mapping, measurable 
— space, xxxiv 
measure, xxiv, xxvii, 1, 5, 26, 32, 36, 39, 72, 
78, 85, 101, 136, 139-141, 154, 166, 
168, 204, 251, 253-255, 257 
absolutely continuous, 50, 136, 154, 257 
atomic, 100 
B-, 53 
Bernoulli-product, see measure, 
p-Bernoulli-product 
Borel, xxxiv, 9, 46, 80, 141, 149, 154 
Cantor, 12, 14, 74, 198
determinantal, 5, 138, 154, 155 
Dirac, see Dirac mass 
equivalent, 154 
— extension, 18, 21, 91, 116, 139, 141, 
167, see also Kolmogorov extension 
Feynman, 34 
fractal, xxiii, 2, 3, 14, 70, 74, 77 
full, 5, 22, 71, 79 
Haar, xxvi, 28, 61, 66, 70, 93, 175, 212, 
251, 255 
Hausdorff, xxviii, 2, 14, 17, 72, 74, 97, 
176, 196, 198, 199, 251, 252 
ambient, 72 
Hutchinson, 48 
infinite-product, 149, 154 
invariant, 52, 53, 70 
Lebesgue, xxviii, 1, 3, 14, 15, 22, 32, 36, 
55, 94, 106, 136, 137, 175, 251 
matrix-valued, 111 
non-atomic, 41, 91 
operator-valued, 114, 115, 117 
orthogonal, 167 
p-Bernoulli-product, 27 
path-space, xxviii, xxx, xxxi, 1, 4-9, 11, 
18, 19, 21, 26, 34, 35, 37, 43, 45, 51, 
57, 59, 60, 65, 70, 71, 77, 79, 84, 91, 
98, 100, 111, 115, 122, 130, 254 
Perron—Frobenius, see measure, 
Perron—Frobenius—Ruelle 
Perron—Frobenius—Ruelle, 26, 255 
Poisson, 50 
positive, xxxiv, 4, 36, 44, 46, 51, 115, 141 
probability, xviii, xxxiv, 14, 37, 41, 44, 
46, 53, 54, 59, 70, 80, 86, 94, 100, 
115, 117, 122, 149, 167, 195, 204 
Borel, see measure, Borel 
Radon, see measure, Radon 
s-fractal, xxiii 
product, xxvii, 91, 149, 157 
projection-valued, 112, 135, 136, 142, 
166, 167 
Radon, 43, 44, 46, 49, 85, 86, 115-117, 
122 
Ruelle, see measure, Perron—Frobenius—
Ruelle 
σ-additive, 115, 167
— space, 6, 40, 84, 94, 115, 139, 149 
σ-finite, 31, 253
spectral, 112 
— theory, viii, 5, 6, 20, 22, 70, 195, 204 
transition, xxxiv, 21, 90 
W-, xxxiv, 9, 26 
Y. Meyer, xxxii, 33, 176 
middle-third Cantor set, see Cantor set, 
middle-third 
minimal, 11, 171, 172 
— eigenfunction, see eigenfunction, 
minimal 
mirror, 33, 212, 228 
quadrature, see filter, quadrature-mirror 
monotonicity, 92, 155 
J. Morlet, xxxii 
mother function, see function, mother 
MRA, see multiresolution analysis 
multiindex, 165-167, 252 
multiplicity, 111 
— function, 114, 117 
multiplicity function, 255 
multiresolution, xviii-xx, xxvi, xxxi, 
xxxii, 9, 10, 35, 36, 59, 110, 114, 153, 
168-171, 179, 187, 189, 198, 222, 223, 
see also generalized multiresolution; 
wavelet, multiresolution 
— analysis, 6, 16, 36, 109, 181, 194, 198, 
252, 253 
orthogonal, 172 
— wavelet, see wavelet, multiresolution 
multiwavelet, 5, 7, 111, 114, 116, 153 


N-adic, 126 
— map, 90 
— rationals, 92 
— subinterval, 47 
n-fold branch mapping, see branch mapping, 
n-fold 
natural numbers, xviii, 5, 21, 40, 52, 66, 71, 
85, 167, 254 
Nikodym
Radon-— derivative, see derivative,
Radon-Nikodym
non-abelian, see algebra, non-abelian; group, 
non-abelian 
non-atomic, see measure, non-atomic 
non-commutative setting, xv, xxvi, xxix, 
5, 7, 37, 58, 115, 139, 151, 177, 
234, see also canonical anticom- 
mutation relations; probability, 
non-commutative 
non-overlapping, 187, 199 
— partition, see partition, non-overlapping 
norm, xviii, 12, 14, 17, 31, 44, 48, 139, 168, 
173, 181, 204, 210, 228 
normalization, 12, 14, 15, 37, 53, 54, 61, 64, 
70, 101, 114, 125, 127, 135 
normalized solution, 12, 14 
notational convention, 17, 204 


ONB, see basis, orthonormal 
one-torus, 60, 251, 254 
operator, 132, 140, 141, 158, 160, 162, 172, 
192, 196, 210-212, 215, 216, 219, 254 
adjoint, 93, 144, 145, 147, 157, 160, 165, 
185, 200, 209, 210, 212-214, 216, 
217, 254 
— algebra, xxviii-xxx, xxxvi, 6, 22, 37, 
138, 154, 155, 205, 210, 211, 216, 218 
bounded, 218 
bounded linear, 218, 251 
— coefficient, see coefficient, operator 
compact, 55, 58 
composition of —s, 210, 216 
conjugation, see conjugation 
— factorization, see factorization, 
operator 
filter, 135 
subband-, 213 
frame, see frame operator 
— function, see function, operator 
hermitian, 209 
Hilbert space, see space, Hilbert, operators 
in 
identity, 131, 136, 174, 182, 209, 252, 256


linear, 140 
— matrix, see matrix, operator 
— monomial, 165
multi-, 165 
multiplication, 94, 95, 110, 142, 147, 198, 
209, 210, 213, 217, 221, 253 
non-commuting, see non-commutative 
setting 
Perron—Frobenius—Ruelle, xxxiv, xliv, 4, 
8, 9, 19, 21, 26, 33, 43, 48, 49, 57, 61, 
86, 87, 95, 97, 107, 155, 200, 254 
positive, 117, 254 
positive semidefinite, 114 
— process, 116 
— product, 115, 218 
projection, see projection 
row, 210 
scaling, xx, 2, 3, 10, 109, 162, 180, 181, 
187, 188, 190, 200, 207-209, 255 
unitary, 163, 168, 194, 196 
selfadjoint, 138 
semidefinite, 114 
shift, 213, 256 
— theory, xix, xxvi, xxix, 10, 37, 58, 109, 
138, 142, 154, 170, 180, 181, 186, 
210-212, 216, 221, 222, 229, 230 
transfer, xxxiii, xliii, xliv, 4, 19, 26, 33-35, 
57, 87, 115, 254, see also operator, 
Perron—Frobenius—Ruelle 
wavelet, xxxiii, 33, 105, 107 
transition, xxxiv, 8, 9, 21, see also 
operator, Perron—Frobenius—Ruelle 
wavelet, xxxiv, 21 
unitary, xix, 158, 161, 169, 179-183, 186, 
188-190, 210, 211, 218, 219, 221, see 
also factorization of unitary operators; 
function, operator, unitary 
—-valued measure, see measure,
operator-valued 
zero-kernel, 116 
ordering, 51, 209 
orthogonal, 198 
— basis, see basis, orthogonal 
— complement, 190, 257
— decomposition, see decomposition, 
orthogonal 
— expansion, see expansion, orthogonal 
— function theory, 155 
— measure, see measure, orthogonal 
— multiresolution, see multiresolution, 
orthogonal 
— projection, see projection, orthogonal 
— sum, 210, 257 
— vectors, 181, 190 
— wavelet, see wavelet, orthogonal 
orthogonality, 17, 129, 197 
filter, 87 
— relations, 13, 57 
wavelet, xxxii-xxxiv, 5 
orthonormal basis, see basis, orthonormal 


p-Bernoulli product measure, see measure, 
p-Bernoulli product 
p-subinterval, 166 
Parseval 
— basis, see basis, Parseval 
— frame, see frame, Parseval 
— identity, 13, 32, 66, 99, 103, 104, 137, 
162, 168, 191, 193, 210, 228 
— system, 13 
— wavelet, see wavelet, Parseval 
partial isometry, see isometry, partial 
partition, 166 
non-overlapping, 165, 186 
path, 7, 87, 123, 124, 127, 129 
— space, see space, path 
perfect reconstruction, 87, 124, 132, 205 
periodic function, see function, periodic 
permutation of bases, see basis, permutation 
of 
permutative representation, see representa- 
tion, permutative 
Perron, xliii 
Perron—Frobenius—Ruelle theory, 99, see 
also eigenfunction, Perron—Frobenius; 
eigenspace, Perron—Frobenius; 
matrix, Perron—Frobenius; operator,
Perron—Frobenius—Ruelle; theorem, 
Perron—Frobenius 
phase modulation, 198 
phase transition, 33, 34, 87 
physics, xxviii, xxxi, xxxiii, xxxv, 33, 154,
see also quantum physics 
mathematical, 6 
pixel, 33 
Plancherel formula, 164 
pointwise convergence, see convergence, 
pointwise 
Poisson 
— integral, 50 
— measure, see measure, Poisson 
polar decomposition, 150, 201 
positional number system, xx, 33, 40, 69 
Powers—Stormer, 138, 154 
probability, xxviii-xxxi, xxxiii, xliv, 6, 17, 
18, 33-37, 84, 87, 88, 108, 125, 168 
combinatorial — theory, see combinatorial 
probability theory 
conditional, 45, 254 
— distribution, Gaussian, see distribution, 
Gaussian 
free, 177 
— measure, see measure, probability
non-commutative, 37, 177 
— space, see space, probability 
transition, see transition probability 
process, xxviii, xxx, 48, 49, 86
branching, 5, see also branching 
Markov, see Markov process 
operator-valued, see operator process 
processing, see signal processing; image 
processing 
product, 76, 114, 115, 207, see also infinite 
product 
Cartesian, 43, 257 
Cauchy, see Cauchy product 
infinite, see infinite product 
infinite-— measure, see measure,
infinite-product 
inner, 9, 75, 114, 193, 212, 257 
matrix, see matrix product 


— measure, see measure, product 
operator, see operator product 
random, 34, 84 
Riesz, see Riesz product 
tensor, xxvii, 6, 56, 58, 109, 142, 147-149, 
151, 152, 158, 165, 168, 179, 180, 186, 
189, 257, see also infinite product, 
tensor 
projection, 10, 55, 94, 112, 117, 173, 216, 
228 
final, 173 
initial, 173 
orthogonal, 10, 135, 141, 167, 216 
—-valued measure, see measure, 
projection-valued 
pure, 169 
pyramid, 125, 127, 158, 159, 173, 188, 226 
— algorithm, see algorithm, pyramid 
dyadic, 159 
singly generated, 159 


quadrature, xxix, 131, 227, 228 
— mirror, see filter, quadrature-mirror 
quantization, xxxi, 224-226 
quantum 
— field theory, xxix, 154 
—-mechanical state, see state, quantum- 
mechanical 
— particle, 154 
— physics, xxix, xxxii, 37 
— statistical mechanics, see statistical 
mechanics, quantum 
— theory, 131 
quarter 
— Cantor set, see Cantor set, quarter 
— division, 21 
quasi-free state, see state, quasi-free 


R-harmonic function, see harmonic function, 
R- 
R-invariant, see invariant, R- 
Radon 
— measure, see measure, Radon 
—-Nikodym derivative, see derivative,
Radon-Nikodym 


random, 18, 59, 83 
— process, 36, 55, 204 
— product, see product, random 
— variable, xviii, 55-57 
A-, xxiv 
Gaussian, 56, 57 
— walk, vi, xviii, xxiv, xxviii-xxx, xxxiv,
xliii, 4-7, 16, 21, 34, 39, 40, 42,
48, 57, 77, 83, 84, 87, 100, see also 
process 
— model, viii, 2, 8, 9, 12, 21, 26, 40, 41, 
83, 98, 136 
— on branches, xxviii, xxx, 70 
— on fractal, 21 
range subspace, 117, 162 
reconstruction, see perfect reconstruction 
recursive, 129, 131 
— algorithm, see algorithm, recursive 
— system, 197 
redundancy, 228, 229 
refinement, 4 
— equation, 111 
regularity, xxxiv, 5, 80, 87, 107 
Dini, 4 
Lipschitz, 4 
renormalization group, see group, 
renormalization 
renormalize, xxix 
representation, xxix, 124, 140, 165, 172, 
190, 197, 214, 254-256 
base-point, 219 
boundary, see boundary representation 
irreducible, 183 
matrix, 140, 218 
N-adic, 90, 101, 256 
— of algebraic structure, xxix, 37 
— of CAR algebra, 138 
— of Cuntz algebra, 5, 22, 131, 136, 139, 
152, 154, 155, 158, 161-164, 167, 
168, 170, 174, 179, 180, 182, 183, 
187, 192, 196, 205, 211, 214, 218, 
219, 227 
— of Z by translation, 111 
permutative, 6, 180, 184, 185, 189, 190,
194 
spectral, 112 
subband, 219 
— theory, 154, 155, 180 
unitary equivalence of, 184 
wavelet, 102, 182 
reproduction formula, 144 
resolution, xix, xxii, xxvi, xxix, xxx, 9, 10, 
22, 33, 111, 147, 148, 181, 225, 230, 
see also multiresolution 
— subspace, xxvi, 23, 109, 110, 123, 126, 
152, 180, 207, 255 
multiply generated, 114 
visual, xvi, xviii, xx, xxii, 40, 225 
Riesz 
— product, 84, 97 
—’s theorem, see theorem, Riesz’s 
row-contraction, 217, 222 
D. Ruelle, xliii, 33, 34, 87, 155 
Ruelle, see also eigenfunction, Perron— 
Frobenius; eigenspace, Perron— 
Frobenius; matrix, Perron—Frobenius; 
operator, Perron—Frobenius—Ruelle; 
Perron—Frobenius—Ruelle theory; 
theorem, Perron—Frobenius 
— measure, see measure, Ruelle 
— operator, see operator, Perron—Frobe- 
nius—Ruelle 
—’s theorem, see theorem, Ruelle’s 


sampling, 17, 18, 21, 36, 37, 213, 215 
down-, 87, 124, 128, 132, 133, 205, 206, 
212, 213, 257 
Shannon, xxxi, 18 
— theory, 5, 18 
up-, 87, 124, 132, 205, 206, 212, 213, 257 
scale-N 
— wavelet, see wavelet, scale-N 
scale number, 19, 40, 70, 147, 190, see also 
scaling number 
scaling, xxii, xxiii, 3, 22, 23, 25, 69, 80, 99, 
100, 110, 111, 123, 142, 181, 186, 
188, 195, 198, 208 
— dimension, see dimension, scaling
dyadic, 2, 3, 110, 162, 181, 192, 200, 208, 
209, 255 
— equation 
wavelet, 25 
— function, xx, xxx, 3, 4, 12-14, 16, 23, 
25, 64, 111, 134, 207, 256 
Daubechies, xxxv, 12, 13, 125 
Haar, xxxv, 13, 14, 103 
stretched Haar, 12, 13, 103 
— identity, xx, 3, 10, 14, 15, 17, 25, 30, 
91, 102, 103, 109, 111, 123, 130, 131, 
134, 199, 209 
Cantor, 14 
— in the large, 14 
— number, 143, 146, 147, 188, 208 
— operator, see operator, scaling 
— relation, 3, 75 
—-similarity, 9, 33, 180 
— transformation, 9, 66, 79, 80 
fixed, 3, 9, 10, 22, 66, 188 
Schmidt 
Gram-, see algorithm, Gram—Schmidt 
—’s decomposition theorem, see theorem, 
Schmidt’s 
segment, 165, 166, 186, 257 
self-similarity, xxix, xxxv, 9, 48 
separation of variables, 158, 179 
C.E. Shannon, xxxi, 18 
Shannon sampling, see sampling, Shannon 
shift-invariant, see invariant, shift- 
σ-additive, see measure, σ-additive
σ-algebra, xviii, xxiv, 27, 28, 37, 40, 47, 94,
149, 204, 251-253, 255 
Borel, xxiv, 43, 251 
—s, countable family of, 37 
sub-, xxiv, 37 
tail-, 37 
σ-invariant, see invariant, σ-
signal analysis, xvi, xxvi, 40, 223 
signal processing, vi—viii, xv, xvi, xix—xxii, 
xxvi, xxviii, xxxi-xxxvii, xliv, 4, 6, 10,
18, 23, 25, 26, 33, 37, 39, 87, 123-125, 
130, 131, 135, 155, 181, 187, 189, 
205, 206, 210-212, 215, 216, 218,
219, 223, 224, 227-229, 231, see also 
image processing; speech signal 
singly generated, xxx, 159, 160, 173 
six-tap, 22, 146 
slanted, see matrix, slanted 
space 
compact Hausdorff, 101 
detail, 33, 123, 255 
Euclidean, 22 
ambient, 81 
function, xxii, xxvi, 9, 22, 59, 109, 130, 
142, 146, 147, 157, 158, 179, 190, 
207, 210, 228, 251 
ambient, 3 
Hilbert, xviii-xxi, xxiii, xxvii, xxxvi, 3, 5, 
6, 10, 14, 15, 22, 28-31, 33, 36, 37, 
58, 71, 79, 87, 93, 97, 109, 110, 114, 
136, 139, 140, 142-144, 147-152, 
157, 158, 161, 162, 165, 169, 170, 
172, 174, 176, 180-184, 186, 188, 
189, 196-199, 204, 207, 210, 212, 
213, 217, 218, 221, 222, 228-230, 
251, 252, 254, 257 
ambient, 25, 26, 142, 152, 188, 207, 229 
complex, 58, 114, 143, 144, 173, 
182-184, 203, 252 
dilated, 229, 230 
fractal, 14, 17, 72, 97, 110, 158, 196, 
198 
— geometry, xxvi, 109, 142, 148, 179, 
198, 207, 221, 230 
operators in, xxxii, 6, 37, 131, 138, 142, 
143, 161, 177, 180, 183, 209-211, 
216, 218, 221, 251 
symbolic, 110 
path, xxix, 5, 6, 34, see also measure, 
path-space 
probability, xviii, xxiv, 5, 7, 9, 21, 22, 37, 
47, 50, 55, 56, 60, 70, 114, 135, 203, 
233, 255, 256 
sparse matrix, see matrix, sparse 
spectral 
— analysis, 158, 181 
joint — radius, xxxiii
— measure, see measure, spectral 
— pair, 36, 71, 74 
— representation, see representation, 
spectral 
— theorem, see theorem, spectral 
— theory, xliv, 37, 57 
— transform, 110 
spectrum, 34, 138, 254 
peripheral, 57 
speech signal, xxxi, 124, 205, 227 


state, xviii, xxiii, 6, 26, 37, 139-141, 151,
154, 157, 256
equilibrium, 154 
multi-, 139 
— on graph configuration, 35 
quantum-mechanical, xviii, xxxii, 154 
quasi-equivalent, 154 
quasi-free, 138, 139, 154 
statistical mechanics, 33, 34, 42, 87, 154 
quantum, 34, 154 
statistics, xxxi, xxxiii, 154
Stone—Weierstraß theorem, see theorem,
Stone—Weierstraß


stretched Haar wavelet, see wavelet, Haar,
stretched
R. Strichartz, 26, 76, 80, 87 
subband, 23, 124, 205, 215, 227 
— coding, xxxi, 123 
— filter, see filter, subband 
frequency, see frequency subband 
— representation, see representation, 
subband 
subdivision, xxxii, xxxv, 4, 123, 124, 195 


— algorithm, see algorithm, subdivision 


dyadic, xxxv 

Mallat, see Mallat subdivision 
subinterval, 167 

dyadic fractional, see dyadic fractional 

subinterval 

N-adic, see N-adic subinterval 

p-, see p-subinterval 
subspace
invariant, see invariant subspace
resolution, see resolution subspace
substitution, iterated, 34 
symbolic dynamics, see dynamics, symbolic 
synthesis, 124, 132, 205, 206, 227, see also 
analysis (engineering) 


tap number, 147, see also two-tap; four-tap; 
six-tap 
tensor, 140, 148, 150, 162 
— factorization, see factorization, tensor 
— product, see product, tensor 
theorem 
dominated convergence, 65, 79 
Fatou—Primalov, 48, 50 
“Fatou’s lemma”, 101
Fubini’s, 191 
Karhunen—Loève, 55, 58, 148, 204
Kolmogorov’s, 204 
“Kolmogorov’s lemma”, 48, 149 
martingale convergence, 21, 48, 51, 92 
Perron—Frobenius, 34, see also Perron— 
Frobenius—Ruelle theory 
Riesz’s, 37, 45, 46, 141 
Ruelle’s, xxxiv, see also Perron— 
Frobenius—Ruelle theory 
Schmidt’s, 58, 148, 150 
“Schwarz’s inequality”, 31 
spectral, 39, 55, 58, 110, 112, 150, 201 
Stone—Weierstraß, 27, 39, 43-46, 79
Szegő’s limit, 154
uniqueness, 86 
“Zorn’s lemma”, 172 
tiling, 87, 124, 165, 166, 182, 185-189 
dyadic, 187 
self-affine, 100 
time-localized function, see function, 
time-localized 
torus, 25, 204, 251, see also one-torus 
trace 
— formula, 34 
normalized, 184 
traditional wavelet setup, xxx, 4, 6, 10, 25, 
26, 71, 110, 112 
transfer operator, see operator, transfer
transformation 
— group, see group, transformation 
— rule, 165, 166, see also basis 
transformation 
transition 
— measure, see measure, transition
— operator, see operator, transition 
— probability, xxxiv, 5, 9, 12, 17, 19, 21, 
34, 37, 39-41, 48, 50, 59, 62, 63, 70, 
254 
translation, 110, 111, 198 
integer, 126, see also integral translates 
—-invariant, see invariant, translation- 
tree, xxxiv, 5, 42, 87, 124, see also com- 
binatorial tree; decision tree; Farey 
tree 
two-cycle, 160 
2-fold branch mapping, see branch mapping, 
2-fold 
two-tap, 146 
Tychonoff infinite-product topology, 7, 43 


uniqueness theorem, see theorem, 
uniqueness 
unit interval, 62, 90, 137, 139, 166 
unitarity, 132, 174, 175, 183, 191, 220 
unitary, xix 
— equivalence, 151, 163, 169, 184 
— extension principle, 107, 222 
infinite-dimensional — group, see group, 
infinite-dimensional unitary 
— isomorphism, see isomorphism, 
unitary 
— matrix, see matrix, unitary 
— operator, see operator, unitary 
— scaling operator, see operator, scaling, 
unitary 
up-sampling, see sampling, up- 


variable 
dual, see dual variable 
Fourier-dual, see Fourier dual 
random, see random variable 


walk, 6, 41, 128, see also transition; random
walk
wavelet, vi, vii, xv—xvii, xix, xx, xxv—xxx,
xxxii-xxxvi, xliii, xliv, 1-7, 9, 14, 16,
17, 22, 23, 25, 33-37, 39, 40, 57, 58, 
64, 71, 79-81, 83, 87, 91, 99, 100, 
104, 105, 107, 109, 111, 122, 130, 
131, 142, 151, 152, 155, 158, 179, 
182, 189, 190, 211, 222-230, 235, 
256, see also traditional wavelet setup 
— algorithm, see algorithm, wavelet 
— analysis, xv, xx, xxviii, xxx, xxxii, 
xxxiii, xliii, 4, 6, 10, 34, 57, 60, 84, 98, 
99, 109, 181, 188, 208, 210, 225, 226 
band-limited, 4 
— basis, see basis, wavelet 
— coefficient, see coefficient, wavelet 
compactly supported, xxx, 14 
— construction, xix, xxix—xxxi, xxxiii, 1, 
4, 5, 7-9, 13, 14, 36, 57, 83, 97, 119, 
121-124, 130, 131, 136, 157, 179, 
180, 192, 198 
Daubechies, 5, 14, 121, 122, 124, 127, 
128, 134-136, 228 
— decomposition, see decomposition, 
wavelet 
dual, xxxii 
dyadic, 15, 39, 99, 123-125, 146 
— expansion, 230 
— filter, xxxii, xxxiii, 5, 23, 33, 37, 83, 
87, 112, 122, 130, 136, 227, 230 
matrix-valued, 22 
fractal, 15, 71, 180 
frame, see frame wavelet 
frequency-localized, xxx, 7 
— function, vi, xvi, xx, xxvii, xxx, 1, 16, 
102, 103, 123, 134, 256 
Daubechies, 110 
Haar, 103 
gap-filling, 22, 72 
Haar, xvi, xxxi, 5, 12-14, 99, 102, 103, 
106, 119, 124, 127, 128, 131, 148, 
192, 229 
dyadic, 136, 137 
stretched, 12, 13, 15, 16, 99, 100, 104,
106, 252 
— matrix, see matrix, wavelet 
multi-, see multiwavelet 
multiresolution, xxx, 10, 16, 97, 110, 142, 
181, 190 
N-adic, 64, 123 
orthogonal, 124 
— orthogonality, see orthogonality, 
wavelet 
— packet, vi, xxxiv, xxxv, 4, 6, 22, 35, 
110, 117, 122-126, 129-131, 136, 
142, 153, 159, 176, 180, 187-189, 256 
algorithm, see algorithm, wavelet packet 
dyadic, 127, 158, 159 
Parseval, 13, 16, 103, 105 
— representation, see representation, 
wavelet 
scale-N, 64, 65, 190, 252 
— theory, xxix, 9, 33, 87, 110, 168, 177, 
222-224 
time-frequency, xxxi
time-scale, xxxi
traditional — setup, see traditional 
wavelet setup 
— transfer operator, see operator, transfer, 
wavelet 
— transform, 142, 186 
discrete, 142, 180, 202 
— transition operator, see operator, 
transition, wavelet 
wavepacket coefficient, see coefficient, 
wavepacket 
Weierstraß
Stone-— theorem, see theorem,
Stone—Weierstraß
weight function, see function, W- 
M.V. Wickerhauser, 117, 176, 177 
N. Wiener, 34 
Wold decomposition, see decomposition, 
Wold 


zeta function, see function, zeta 
Zorn’s lemma, see theorem, “Zorn’s lemma” 